2020 RCD CM Community Dataset report available

Scatter Plot showing the capabilities coverage for all 41 institutions

The report describes the first Research Computing and Data Capabilities Model Community Dataset, which aggregates the assessments of 41 Higher Education Institutions. These assessments were completed using version 1.0 of the Research Computing and Data Capabilities Model (RCD CM) over several months in the spring and summer of 2020. This Community Dataset provides insight into the current state of support for RCD across the community and in several key sub-communities. Download it now! (zenodo.org/record/4344057).

Update August 2021: See also our PEARC21 paper (which won the Best Full Paper award for the Workforce Development, Training, Diversity, & Education track): Assessing the Landscape of Research Computing and Data Support: The 2020 RCD Capabilities Model Community Dataset (also available here if you don’t have ACM access).

The Research Computing and Data Capabilities Model allows institutions to assess their support for computationally- and data-intensive research, to identify potential areas for improvement, and to understand how the broader community views Research Computing and Data support. The Model was developed by a diverse group of institutions with a range of support models, in a collaboration among Internet2, CaRCC, and EDUCAUSE. This Assessment Tool is designed for use by a range of roles at each institution, from front-line support through campus leadership, and is intended to be inclusive of small and large, public and private institutions.

We encourage you to check out the Capabilities Model and begin using it at your institution. Start with the Capabilities Model Introduction and Guide to Use, which includes background, tips for using the model, and a link to the access request form that will create a personalized copy of the Assessment Tool for your institution. You can also watch the recording of the EDUCAUSE webinar. Keep an eye on the RCD CM working group page for more information and updates.

Welcome to February 2021!

Please see below for People Network calls this month, and make sure to join in! (For handy calendar entries, see the CaRCC Events Calendar.) 

Data-Facing Track (first Tuesdays)

Tuesday, February 2, 1p ET/ 12p CT/ 11a MT/ 10a PT

Casual Tuesday Community Roundtable

We want to take this month to hear a bit about what everyone is up to. This session will be a general sharing and free-form brainstorming session. We’d love to hear 3-5 minutes about projects that people are working on currently or new developments. If you’re stuck on something, feel free to bring that forward and get advice from the brain trust. We can also make breakout rooms for deeper discussions that arise.

Researcher-Facing Track (second Thursdays)

Thursday, February 11, 1p ET/ 12p CT/ 11a MT/ 10a PT

Supporting Researchers with Containers

People Network Calls in the New Year

We hope you have a wonderful holiday season and are able to join People Network calls in January! (For handy calendar entries, see the CaRCC Events Calendar.) 

Data-Facing Track (first Tuesdays)

Tuesday, January 5, 1p ET/ 12p CT/ 11a MT/ 10a PT

Python for Big Data

Presenter: Bala Desinghu, Rutgers University

Python is a popular programming language for developing software and data science applications. Its popularity stems from many factors, such as simplicity, readability, and portability. However, Python is slow compared to C or Fortran, and it does not manage memory well. These limitations in speed and memory management may not be significant when analyzing small data sets, but they become bottlenecks when analyzing big data sets. Techniques based on vectorization, parallelization, just-in-time compilation, and distributed task execution have been widely adopted by the Python community to address these challenges. This presentation will address a few techniques suitable for large-scale data analysis and answer the following questions: What to do when the data set size exceeds the available physical memory? How to speed up the data analysis? How to distribute the workloads when doing machine learning for big data sets?
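As a rough illustration of the first question in the abstract above (what to do when a data set does not fit in memory), here is a minimal sketch of chunked, streaming aggregation using only the Python standard library. It is not taken from the presentation; the function names iter_chunks and streaming_mean and the chunk sizes are hypothetical, chosen only for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def iter_chunks(values: Iterable[float], chunk_size: int) -> Iterator[List[float]]:
    """Yield successive lists of at most chunk_size items from any iterable."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    iterator = iter(values)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def streaming_mean(values: Iterable[float], chunk_size: int = 1000) -> float:
    """Compute the mean while holding only one chunk in memory at a time."""
    total = 0.0
    count = 0
    for chunk in iter_chunks(values, chunk_size):
        total += sum(chunk)
        count += len(chunk)
    if count == 0:
        raise ValueError("cannot take the mean of an empty data set")
    return total / count


if __name__ == "__main__":
    # A generator stands in for a file or database cursor too large to load at once.
    data = (float(i) for i in range(1_000_000))
    print(streaming_mean(data, chunk_size=10_000))
```

The same idea, applied at larger scale, is what tools such as the chunksize option of pandas read_csv or the Dask library provide; which of these the talk covers is not stated in the abstract.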

Researcher-Facing Track (second Thursdays)

Thursday, January 14, 1p ET/ 12p CT/ 11a MT/ 10a PT

All about CaRCC (… beyond the R-F Track)

Presenters:
Tom Cheatham, University of Utah
Lauren Michael, University of Wisconsin
Dana Brunson, Internet2
Patrick Schmitz, Semper Cogito Consulting

Join the December People Network Calls, including a Party!

We’ll have just three community calls in December (to avoid conflicts with winter vacation plans), including a cross-network party! (For handy calendar entries, see the CaRCC Events Calendar.) 

CaRCC End-Of-Year Party

Thursday, December 10, 1-2:30pm ET / 12-1:30pm CT / 11am-12:30pm MT / 10-11:30am PT

We’ll use the usual Zoom room, and Zoom’s new support for self-select breakout rooms, with rooms designated for the following topics: Main Room: Greetings, Hors D’oeuvres, and Games; SC After-Party; People Network Brainstorm (share your ideas!); All About CaRCC Working and Interest Groups.

Make sure you’ve updated your Zoom client since September 21 (to version 5.3.0 or higher).

Data-Facing Track (first Tuesdays)

Tuesday, December 1, 1p ET/ 12p CT/ 11a MT/ 10a PT

Creating a Standard Vocabulary: the Unified Medical Language System (UMLS)

David Anderson from the National Library of Medicine will present on the Unified Medical Language System (UMLS). The UMLS is a set of files and software that brings together many health and biomedical vocabularies and standards to enable interoperability between computer systems. David will provide a basic overview of the UMLS including its history and use cases.

Emerging Centers Track (third Wednesdays)

Wednesday, December 16, 12p ET/ 11a CT/ 10a MT/ 9a PT

NSF Program Directors – Campus Cyberinfrastructure (CC*) NSF 21-528

You will not want to miss this call! The NSF CC* program is a perfect program for emerging centers to begin developing their research partnerships and center resources.

Kevin L. Thompson (NSF CISE/OAC) and Deepankar (Deep) Medhi (NSF CISE/CNS) will share information about the recently released CC* Program solicitation and answer any questions you may have. Jen Schopf, Director of EPOC, will then answer some commonly asked questions that EPOC has heard from previous submitters to the CC* program, and will briefly explain how the EPOC program experts can assist you in developing your proposal.

We encourage previous recipients of the award to share any lessons learned or tips and tricks for developing a successful proposal.

People Network Calls in November; End-of-Year Party in December

Mark your calendars for these upcoming People Network virtual meetings. (For handy calendar entries, try the CaRCC Events Calendar.)

CaRCC End-Of-Year Party (Save the date!)

Thursday, December 10, 1-2:30pm ET / 12-1:30pm CT / 11am-12:30pm MT / 10-11:30am PT

We’ll use Zoom’s new support for self-select breakout rooms, with rooms designated for several topics, including opportunities to learn about and get involved with CaRCC Working Groups and People Network engagement, and a room for an SC20 ‘after-party’ discussion. Make sure you’ve recently updated your Zoom client (to version 5.3.0 or higher) to take full advantage.

Data-Facing Track (first Tuesdays)

Tuesday, November 3, 1p ET/ 12p CT/ 11a MT/ 10a PT

Teaching Data Skills Remotely: Check-in

This month is a check-in discussion about teaching research data skills remotely 6+ months into the effort.  What have you learned about teaching R, Python, SQL, and other data programming languages and skills online?  What resources (technical and people) are required to be successful?  What have you changed from your initial efforts?  How have your experiences influenced your thinking about workshops and training for the future?  What training do you provide other than live workshops?  Any experiments in remote formats?  We’ll have a panel of folks willing to briefly share their experiences, but there will also be an open discussion time to share ideas and resources and ask questions.

Researcher-Facing Track (second Thursdays)

Thursday, November 12, 1p ET/ 12p CT/ 11a MT/ 10a PT

Join Us for October People Network Calls

Mark your calendars for these upcoming People Network virtual meetings. (For handy calendar entries, try the CaRCC Events Calendar.)

Data-Facing Track (first Tuesdays)

Tuesday, October 6, 1p ET/ 12p CT/ 11a MT/ 10a PT

The Power of Electronic Lab Notebooks

Electronic Lab Notebooks (ELNs) are digital tools that help address growing concerns around research data. Researchers are concerned about reproducibility, and granting agencies are concerned about data management and availability. Labs are also collaborating with people around the world. Paper just doesn’t cut it anymore. With ELNs, researchers can capture notes about their experiments and attach the generated data files directly to them. Then, they can give their collaborators access to the data by sharing the notebook with them. Plus, many notebooks allow for metadata generation and have advanced search capabilities, which paper notebooks cannot offer.

Join us to hear about the general concept of ELNs, some of the popular products on the market today, what supporting them looks like, and a bit about how to obtain one for your university.

Researcher-Facing Track (second Thursdays)

Thursday, October 8, 1p ET/ 12p CT/ 11a MT/ 10a PT

Join Us for September People Network Calls

Mark your calendars for these upcoming People Network virtual meetings. (For handy calendar entries, try the CaRCC Events Calendar.)

Data-Facing Track (first Tuesdays)

Tuesday, September 1, 1p ET/ 12p CT/ 11a MT/ 10a PT

An Institution-wide Examination of Data Needs: NSF EPOC Deep Dive at the University of Cincinnati

The Engagement and Performance Operations Center: Overview and Opportunities. The Engagement and Performance Operations Center (EPOC), funded by the US National Science Foundation, is a collaborative focal point for operational expertise and analysis jointly led by Indiana University (IU) and the Energy Sciences Network (ESnet). The Center enables researchers to routinely, reliably, and robustly transfer data through a holistic approach to understanding the full pipeline of data movement to better support collaborative science. Through its measurement and monitoring work, as well as associated service advice and training, it brings together multiple knowledgeable and experienced science engagement teams.

A University’s Point of View. The University of Cincinnati was fortunate to be one of the first EPOC Deep Dive locations. We will share 1) the process we used to collect the data and case studies from researchers before the onsite visit from EPOC, 2) the two-day visit with the EPOC team, and 3) outcomes, challenges, and opportunities for implementing the recommendations.

Researcher-Facing Track (second Thursdays)

Thursday, September 10, 1p ET/ 12p CT/ 11a MT/ 10a PT

Supporting Researchers in the Cloud: One University’s Approach

Capabilities Model Data Submission Deadline Extended to Sept 27, 2020!

Thanks to all the attendees and participants in the Capabilities Model activities at PEARC this year!

Following the CaRCC Town Hall, the Caps Model paper presentation, and the full-day Capabilities Model workshop, we had an overwhelming number of new downloads of the tool. Because of this, and because of feedback from institutions already working through the Model, we are extending the data submission deadline to September 27, 2020. We hope the extra time will allow everyone to complete the Model and meet the 2020 community data submission deadline.

To learn more about the Capabilities Model: RCD CM webpage
Start here document: Capabilities Model Introduction and Guide to Use
To request your institution’s copy of the Model: Access Request Form

Upcoming Office Hours:
Tuesday, Aug 25th 11am-1pm ET / 10am-12pm CT / 9-11am MT / 8-10am PT
Thursday, Sept 17th 2-4pm ET / 1-3pm CT / 12-2pm MT / 11am-1pm PT
Tuesday, Sept 22nd 11am-1pm ET / 10am-12pm CT / 9-11am MT / 8-10am PT

For help: capsmodel-help@carcc.org

Original blog post: Announcing the RCD CM 2020 Community Data participation window

Something Different for August – Joint Track Calls!

For August, we’ll do something a little different, with two opportunities for cross-track conversation. (These are in lieu of track-specific calls, which are otherwise canceled for August.)

PEARC20 After-Party: August 4th @ 1:00pm-2:00pm ET 

(at the usual Data-Facing track meeting time)

Be sure to join at the top of the hour to finalize topic-based breakout rooms. We’ll use a Blackboard Collaborate session, which supports breakout rooms that participants can move between freely. Potential breakout topics suggested by our track coordinators are listed below:

  • Conference Feedback
  • RCD Professionalization
  • New Applications Tech
  • New Systems/Services Tech
  • Communicating about RCD Resources
  • Review of Sessions about CaRCC Activities
  • Happy Hour (whatever that means to you)

The Blackboard Collaborate session, a Google Doc for notes, and a calendar invite have been shared via email with the entire People Network. If you did not receive these and would like to join the call, please contact help@carcc.org.

Service Models for Researcher-Purchased Computing and Storage: August 20th @ 1:00pm-2:30pm ET

(at the usual Systems-Facing track meeting time)

Description: The term “condo” is an umbrella term for an increasingly common family of service models for research computing and data storage in higher education. However, the way this design pattern manifests can vary greatly from one institution to another, and there’s no single right way to implement for-purchase computing and data capacity. The purpose of this call is to discuss approaches to researcher-purchased capacity from a variety of perspectives, including those of systems professionals, support and facilitation professionals, researchers, and other stakeholders. Discussion areas will include ownership models, funding and purchase strategies, user experience and policy considerations, and hosting and operational support. The call will follow a panel format, with a series of short site introductions by representatives of diverse service models, followed by a longer Q&A. To accommodate the multiple perspectives and facets, this call will be 90 minutes long.

The usual Zoom coordinates have been distributed via email to the People Network email list, or can be requested via help@carcc.org.