Brief meeting notes containing DECISIONS and ACTIONS

-- Slovenian feedback
GPGPUs (many users from deep learning). Some extension in XRSL might be necessary.
Would be nice to provide a High Availability CE, failover mode.
Advanced usage of Singularity.
Wish for the return of the standalone client.
Wish for a Mac client for ARC 6.

-- ACT workflow manager (ACT for users)
REST interface to ACT, as planned.
Currently single-user.
Feature set to be discussed: a vision could be something like a Dropbox for jobs.

-- Release manager report
What to do with documentation for ARC 6? This was addressed later in the week.

-- xrootd/https/gridftp protocols
xrootd in ARC is very slow; this is an ARC implementation issue. Bug filed already.
ACTION: list the issue under known issues unless the bug gets fixed.
ACTION: switch the production NDGF storage element to https NOW (a pending action from Tromso).

-- Scalability concerns and bottlenecks in the ARC data handling code
Bug to be reported about the data thread congestion problem.
We need a new, separate data staging system/architecture. Liberate DTS as a separate service.

-- ACT updates (David)
See slides. It can now submit to Condor!

-- Maiken: Elastic Cluster & ACT local plugin

-- EMI-ES status
See: http://wiki.nordugrid.org/wiki/EMI-ES_and_ATLAS
DECISION: walltime and totalwalltime will both be consistently returned by the client SDK. This is an incompatible modification, so it will only be part of ARC 6.

-- Containers & ARC
An example of Singularity usage at the Sherlock cluster: http://sherlock.stanford.edu/mediawiki/index.php/Containers
ACTION (Florido): first we need to write a one-page "vision" on how containers could be launched on an ARC CE and what a user/ARC is supposed to do in that workflow. Then we can try to match it with available tools in ARC such as the RTEs.

-- GIT
DECISION: we do go for git. We adopt the master + major version x, major version y branch model (see picture and Maiken's text).
At least at the beginning we require merge requests and a dedicated dev branch even for a single-line code fix.

-- ARC release concept, versioning
An ARC release will consist of:
- core (server side)
- client, SDK
- metapackages
- release note
DECISION: the release number will be identical to the git tag (like 6.0.0) of the core ARC. We said goodbye to 17.02.u1-type versioning.
DECISION: the release note will be in the top source dir; there will also be a separate dir for the older release notes.
DECISION: documents will go to the web, structured clearly by major version (e.g. separate sub-page or dir for release 5, release 6).
DECISION: create a documentation task force for Release 6. Chair: Florido. The first task of the group is to come up with a suggestion for where to put the ARC 6 draft docs.
DECISION: Nagios is offloaded to NeIC as a separate product.
DECISION: jura_to_es: separate git repo, will not be released with ARC.
DECISION: gangliarc: separate repo in git, separate release cycle. Side decision: the gangliarc block does NOT stay in arc.conf.

-- ARC 6 packaging & dependencies
ACTION: Mattias reviews the current packages and comes up with a new proposal for ARC 6: package names, content.
ACTION: rename nordugrid-arc-ldap-monitor to something else (since it contains archery code as well). Remove the ws-monitor package.

-- LRMS for ARC 6
DECISION: there will be a fork python-lrms, to be delivered by Christian. No Condor.
A PBS bug and the Condor core miscalculation will be fixed.
Some issues with multiple SLURM partitions.

-- RTEs for ARC 6
DECISION: system RTEs (e.g. env-proxy) will be distributed as part of the main arc package.
DECISION: turning system-installed RTEs on/off will be done using symlinks. Andrii provides the solution.
Anders raises the question of the "order of RTEs". Aleksander claims it is OK: the XRSL order is treated/kept consistently.
ACTION: remove the dead RTE link from www.nordugrid.org.
DECISION: we introduce RTE parameters and extend XRSL. Details to be posted by Andrii. Candidate for ARC 6.
Known issues: no trust in RTE, namespace conflicts may occur.

-- Defaults for ARC 6
Not much progress. Some defaults may need to be agreed, see ACTIONS here: http://svn.nordugrid.org/trac/nordugrid/browser/arc1/branches/6.0/src/services/etc/arc.defaults.conf
ACTION: dedicated skype call to go through the ACTIONS in the defaults file.
No magic, no breakthrough for ARC 6.

-- Performance logging code
DECISION: remove the option from arc.conf.reference and remove it from the arex startup script.

-- Infosys lrms modules: SLURM.pm and SLURMmod.pm
DECISION: not feasible to do major changes/cleanup for ARC 6. Florido will add disclaimers in the code. This will open up a new development for ARC 7.

-- arc.conf
Enable by blocks: everybody claims it is done.
DECISION: Olexander takes over the incomplete JURA feature(s): archivettl.
ACTION: remove EGIIS-related blocks.
DECISION: already in 5.5 we mark glue1.3 schema support as DEPRECATED.
ACTION: turn on the validator as part of the startup scripts NOW.

-- zero.conf
use-case 1: Apache-like out-of-the-box service startup right after package install.
Defaults for arc.conf:
- LRMS: fork
- sessiondir (1777): /var/spool/arc/session
- controldir: /var/spool/arc/control
- user (for the mapping): nobody
- daemons running as root
- no cache
- no ldap
- emi-es for job and info
- no gridmapfile, mapping to be done in arc.conf
- user mapping: everybody to nobody
- hostname (also for EMI-ES endpoint): localhost
- no certificates (special feature of EMI-ES, never tried even by Aleksandr)
- minimal cluster & queue info block
ACTION: create an arc.conf instance (named demo-arc.conf) with the above values; the package installer places it into /etc/arc.conf.
use-case 2: installer/configurator to pick proper blocks. This is postponed until we better understand arc.conf blocks & dependencies. ARC 7 candidate.
ACTION: check all the other conf templates and decide if those are still needed.

-- ARCHERY for 6.0
Packaging: client is part of the arc-client package, part is in the monitor package.
Separate subpackage for archery-manager called nordugrid-arc-archery-manage.

-- EGIIS retirement
EGIIS support stays in the client SDK but is marked as DEPRECATED. The monitor can still fetch info from old deployed EGIIS servers.
BUT: the EGIIS server won't be part of ARC 6, and neither will cluster registration to EGIIS.
ACTION: Mattias takes care of server-side cleanup. Balazs takes care of arc.conf.reference.
DISCUSSION TOPIC for future: authorized access to job/cluster/queue info vs. old-style open access a la LDAP.

-- Testing for 6.0
Status as of today: some Nagios tests exist in Kosice (not clear what they do), also the standard set of EGI probes and NeIC tests. There is an old list of EMI test cases with a nice numbering scheme and most probably irrelevant content.
TODO: we need a numbered list of well-defined test cases. A test case is defined by: server-side arc.conf, deployment instructions, client-side commands to execute, expected output/behaviour.
Then, we would need a nice platform to execute functionality tests and to collect & visualize results.
DECISION: ATLAS HammerCloud will be used before every release as a stress test. The candidate releases will be deployed on Maiken's cluster. We can have several types of tests defined (e.g. many small files, or few large files).
ACTION: investigate if we could move tests from Nagios over to Jenkins or vice versa (see the results of the tests).

-- ARC 6 timeline
Incomplete: event-driven, jura-config, egiis-removal, some-defaults, some-infosys-config, rte on-off, rest-interface, python-fork (non-blocker), startup-scripts, globus-memory-leak, arc-demo.conf, new packaging, removal of OBSOLETED code.
First alpha release by end of January 2018.
Then, we need to fix as well: documentation, testing, new release candidates for 2 months -> optimistic ARC 6.0 release date: April 2018.
ACTION: Balazs populates the tracker (what, who, time estimate).
DECISION: release preparation follow-up skype call in two weeks.
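
Note on the testing TODO above: the four parts that define a test case (server-side arc.conf, deployment instructions, client-side commands, expected output/behaviour) could be captured in a small record type. The following is only a sketch in Python, not an agreed format; the class name, the field names, the numbering format and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FunctionalTestCase:
    """One numbered test case; all field names are hypothetical."""
    number: int
    title: str
    server_arc_conf: str                 # server-side arc.conf (content or path)
    deployment_instructions: str         # how to deploy the server for this test
    client_commands: List[str] = field(default_factory=list)
    expected_behaviour: str = ""         # expected output/behaviour

    def missing_parts(self) -> List[str]:
        """Return the names of the four required parts that are still empty."""
        parts = {
            "server_arc_conf": self.server_arc_conf,
            "deployment_instructions": self.deployment_instructions,
            "client_commands": self.client_commands,
            "expected_behaviour": self.expected_behaviour,
        }
        return [name for name, value in parts.items() if not value]

    def is_well_defined(self) -> bool:
        """A test case is well defined only when all four parts are present."""
        return not self.missing_parts()


def numbered_list(cases: List[FunctionalTestCase]) -> List[str]:
    """Render the cases as a numbered list, sorted by test number."""
    ordered = sorted(cases, key=lambda c: c.number)
    return [f"{c.number:03d} {c.title}" for c in ordered]


if __name__ == "__main__":
    # Illustrative values only; not taken from any existing test suite.
    demo = FunctionalTestCase(
        number=1,
        title="submit a trivial job to a fork CE",
        server_arc_conf="demo-arc.conf",
        deployment_instructions="install packages, start services",
        client_commands=["arcsub hello.xrsl"],
        expected_behaviour="job is accepted and finishes",
    )
    print(numbered_list([demo]), demo.is_well_defined())
```

A record like this makes it easy to reject incomplete test cases before they enter the numbered list, whichever platform ends up executing them.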
-- Overall code cleanup
Non-exhaustive list: CREAM, UNICORE, Windows-related, Solaris, Java bindings, BES, EMIR, JDL, JSDL, ws-monitor, old uploaders/downloaders.