Behind the Data: Understanding RDM Roles
Who Does What? Data Steward vs Data Engineer vs Data Scientist
As our data landscape grows, you may hear terms like Data Steward, Data Engineer, and Data Scientist. These roles sound similar, but they focus on very different parts of the data lifecycle. Here’s a simple way to understand them, using research examples.

Infographic created with ChatGPT
Data Steward
Data stewardship covers a wide range of responsibilities, but a common goal across all data steward positions is ensuring that data meets the FAIR principles: Findable, Accessible, Interoperable, and Reusable. For more information about FAIR data, take a look at the previous newsletter on metadata.
Typical tasks may include curating and annotating data with appropriate metadata, developing standards and documentation practices, and providing training or workshops to researchers. These activities help researchers use tools and workflows that support effective Research Data Management (RDM) and improve the long-term usability and impact of their data.
Data stewards help ensure that:
- Data is properly annotated with metadata (e.g., strain, media, temperature, protocol)
- Naming conventions are consistent
- Required fields are completed
- Data is stored in the appropriate system
- Documentation is available for future reuse
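
Some of the checks in this list can be automated. The snippet below is a minimal sketch in Python, not a description of any tool we actually run: the required fields are taken from the metadata examples above, while the filename pattern (`<project>_<YYYYMMDD>_<sample>.<extension>`) and the function name `check_record` are assumptions made up for illustration.

```python
import re

# Assumed required fields, borrowed from the metadata examples in the list above.
REQUIRED_FIELDS = ("strain", "media", "temperature", "protocol")

# Hypothetical naming convention: <project>_<YYYYMMDD>_<sample>.<extension>
FILENAME_PATTERN = re.compile(r"^[a-z0-9]+_\d{8}_[a-z0-9]+\.[a-z0-9]+$")


def check_record(filename, metadata):
    """Return a list of human-readable problems found in one data record."""
    problems = []
    # Check the filename against the assumed naming convention.
    if not FILENAME_PATTERN.match(filename):
        problems.append(f"filename '{filename}' does not follow the naming convention")
    # Check that every required metadata field is present and not blank.
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"required field '{field}' is missing or empty")
    return problems
```

Calling `check_record("Final data v2.csv", {"strain": "x", "media": ""})` would report four problems: the filename, plus the empty `media` field and the missing `temperature` and `protocol` fields. A record that passes every check returns an empty list.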

Image from Unsplash
Data Engineer
Data engineers are software developers who design, build, and maintain the data infrastructure that stores, moves, and processes data.
In a research setting, this might mean:
- Automatically transferring sequencing outputs into structured storage
- Building pipelines that process raw imaging files
- Maintaining compute environments for large-scale analysis
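
To make the first example more concrete, here is a minimal Python sketch of sorting sequencing outputs into structured storage. It is an illustration under stated assumptions rather than an actual pipeline: the function name `organize_outputs`, the `<project>_<run>_<sample>.fastq` naming scheme, and the `storage/<project>/<run>/` folder layout are all hypothetical.

```python
import shutil
from pathlib import Path


def organize_outputs(incoming_dir, storage_dir, suffix=".fastq"):
    """Copy files named <project>_<run>_<sample><suffix> into storage/<project>/<run>/.

    Returns the list of paths that were written. Files that do not match
    the assumed naming scheme are left where they are.
    """
    copied = []
    for source in sorted(Path(incoming_dir).glob(f"*{suffix}")):
        # Strip the suffix and split the remaining name into its three parts.
        parts = source.name[: -len(suffix)].split("_")
        if len(parts) != 3:
            continue  # skip files that do not follow the assumed scheme
        project, run, _sample = parts
        target_dir = Path(storage_dir) / project / run
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / source.name
        # Copy rather than move, so the original stays in place until verified.
        shutil.copy2(source, target)
        copied.append(target)
    return copied
```

Copying instead of moving is a deliberately cautious choice in this sketch: the raw file remains in the incoming folder until someone, or a later step, confirms the copy is intact.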
Data Scientist
A Data Scientist focuses on analyzing data to answer scientific questions.
A data scientist might:
- Perform statistical analysis
- Build models
- Identify patterns across datasets
- Create visualizations
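
As a small illustration of the first item, the Python sketch below summarizes measurements per experimental condition using only the standard library. The function name `summarize_by_group` and the `(condition, value)` input format are assumptions chosen for this example. Real analyses typically go well beyond summary statistics.

```python
from statistics import mean, stdev


def summarize_by_group(measurements):
    """Group (condition, value) pairs and return count, mean, and
    sample standard deviation for each condition."""
    groups = {}
    for condition, value in measurements:
        groups.setdefault(condition, []).append(value)

    summary = {}
    for condition, values in groups.items():
        summary[condition] = {
            "n": len(values),
            "mean": mean(values),
            # The sample standard deviation needs at least two values.
            "stdev": stdev(values) if len(values) > 1 else 0.0,
        }
    return summary
```

For example, `summarize_by_group([("30C", 1.0), ("30C", 3.0), ("37C", 2.0)])` reports a mean of 2.0 over two measurements for the hypothetical condition "30C".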
How These Roles Complement Research
- The Data Engineer ensures data can be reliably stored and accessed.
- The Data Steward ensures data is understandable, structured, and reusable.
- The Data Scientist uses that data to generate knowledge.

Image taken from Unsplash