ATTENDING
- Jerry Sheehan, Pol Llovet, Aurelien Mazurie, Thomas Heetderks
ABSENT
MINUTES: HPCAG Meeting #2
- Welcome & Introduction - Jerry
- Queue Limits & Optimization - Pol
(handout: Hyalite Queue Specification)
- (disclaimer) for our initial SLURM queue configuration– we started simple, with NO
job limits.
- because of the way SLURM schedules work, long-running jobs can tie up the queue, so
we need to establish job limits.
- we have these queue job limit recommendations (see the handout)– we need your input
on these recommendations.
- these changes will go into effect after the February 4 maintenance window (ACTION
ITEM).
- discussion: can we set up job monitoring (to identify jobs running a long time without
much activity)? Yes (ACTION ITEM).
- once these queue job limits are implemented– we can adjust these limits as necessary.
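(note) The job-monitoring idea discussed above could be sketched roughly as follows. This is
only an illustration: the record fields, the function names, and the 24-hour and 5% thresholds
are assumptions made for the example, not RCi's plan and not part of SLURM. In practice the
inputs would have to come from SLURM accounting data.

```python
from dataclasses import dataclass


@dataclass
class JobSample:
    """One running job as a monitoring script might see it (hypothetical fields)."""
    job_id: str
    elapsed_hours: float  # wall-clock time since the job started
    cpu_hours: float      # CPU time consumed so far, summed over all cores
    cores: int            # number of cores allocated to the job


def cpu_efficiency(job: JobSample) -> float:
    """Fraction of the allocated core-hours that the job has actually used."""
    allocated = job.elapsed_hours * job.cores
    return job.cpu_hours / allocated if allocated > 0 else 0.0


def find_idle_jobs(jobs, min_elapsed_hours=24.0, max_efficiency=0.05):
    """Return IDs of jobs that have run a long time with little CPU activity.

    Both thresholds are placeholder values chosen for this sketch.
    """
    return [
        j.job_id
        for j in jobs
        if j.elapsed_hours >= min_elapsed_hours
        and cpu_efficiency(j) <= max_efficiency
    ]


if __name__ == "__main__":
    sample = [
        JobSample("1001", elapsed_hours=72.0, cpu_hours=1.0, cores=16),     # long and idle
        JobSample("1002", elapsed_hours=72.0, cpu_hours=1000.0, cores=16),  # long and busy
        JobSample("1003", elapsed_hours=2.0, cpu_hours=0.0, cores=4),       # too young to judge
    ]
    print(find_idle_jobs(sample))
```

Only the first sample job is flagged: it has run for 72 hours while using well under 5% of its
allocated core-hours. The short third job is skipped so that jobs still starting up are not
reported.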
- Storage Strategy - Pol
(handout: Research Storage Technical Strategy)
- we are acquiring a backup appliance for on-site backup storage–
- we will be spending about $24K to purchase this appliance within the next 2 weeks
(from January 28).
- this will probably live in the Renne Data Center (on-site, but in a different building).
- this will provide about 50TB of storage to start– this will be expanded and we'll
probably use compression for more capacity.
- this will be highly durable storage with bit-rot protection.
- we are in negotiations with Indiana University to provide off-site backup storage–
- (Jerry) this is nearly complete– we are finalizing legal language for data
rights retention.
- we will get about 50TB of off-site storage (again, we will probably use compression).
- we currently back up your Hyalite HOME directories on a nightly basis (to a space on
the Lustre file system)– these are the basis for future on-site and off-site backups.
- (Jerry) we have 3 primary storage needs–
- deep archival/backup storage– we are doing this right now (ACTION ITEM).
- project storage– we will execute an RFP to acquire this storage down the road.
- instrument backup for big data from labs– we will address this at a later point.
- MATLAB Licensing Opportunity - Pol
- we have an opportunity to acquire a campus-wide MATLAB Site License for $47K.
- this license will allow MSU to–
- install MATLAB on all faculty/student/staff computers and on Hyalite.
- access 16 of the most popular MATLAB toolboxes including all that we currently use
and the parallel computing toolbox.
- we intend to request CFAC funding for this MATLAB License.
- we would like the endorsement of this group (to add to our other endorsements) for
this request (ACTION ITEM).
ACTIONS
- RCi will write a MATLAB Site License letter of endorsement for your signatures
- RCi will move these job queue limits into production during the February 4 maintenance
window
- RCi will check with BIOSIT about setting up job monitoring
- Within two weeks (of January 28) RCi will purchase a storage appliance for backup/archival
storage
FUTURE AGENDA
University Information Technology
P.O. Box 173240
Bozeman, MT 59717-3240