Data Curation & Quality Control Specialist (Satellite Imagery)
GalaxEye Space
Data Science, Quality Assurance
India
Posted on Jun 30, 2025
As part of the ML Data Engineering team, you will help advance and manage our satellite imagery data pipeline and infrastructure. You will collaborate closely with cross-functional teams, including the machine learning team, the satellite team, and data science specialists. Beyond technical expertise, your role will involve shaping data products, continuous learning, and upholding best practices. Your experience and contributions will be instrumental in driving process improvements and fostering collaboration across the organization.
Responsibilities:
Data Acquisition & Management:
Download satellite imagery from various sources and APIs
Curate and organize satellite imagery datasets according to established standards
Build and maintain comprehensive data catalogs with proper metadata tagging
Ensure data quality and integrity throughout the acquisition process
Infrastructure & Storage:
Optimize storage costs while maintaining data accessibility and performance
Monitor and maintain data backup and recovery systems
Data Labeling & Annotation:
Coordinate and execute satellite imagery labeling initiatives with data scientists
Develop and implement quality control processes for labeled datasets
Work with domain experts to establish labeling guidelines and standards
Manage labeling workflows and delivery to ML teams
MLOps & Pipeline Support:
Support the development and maintenance of automated data pipelines
Implement data versioning and lineage tracking
Collaborate on model training data preparation and validation
Assist in deploying and monitoring data processing workflows
Cross-functional Collaboration:
Work closely with the satellite team to understand data requirements
Support the machine learning team with timely data delivery
Participate in technical discussions and provide data-driven insights
Document processes and maintain technical documentation
Requirements
Who Should Apply
We welcome final-year students, recent graduates, and early-career professionals (0–1 years of experience) with a passion for data systems and applied ML. We encourage applicants from any academic background with the right technical skills and willingness to learn.
Technical Skills:
Strong programming skills in Python and familiarity with data processing libraries (pandas, NumPy, etc.)
Strong problem-solving skills and debugging ability
Experience with cloud platforms (AWS, Google Cloud, or Azure) and storage solutions
Understanding of database systems (both SQL and NoSQL)
Familiarity with data pipeline tools and workflow orchestration
Basic knowledge of containerization (Docker) and version control (Git)
(Bonus) Understanding of satellite imagery and geospatial formats
Personal Attributes:
Responds rapidly, adapts to change, and stays nimble in approach without compromising the overall quality of work
Develops solutions through an adequate mix of intuition, reason and logic
Pushes the collective quality of thought to new limits while trusting and respecting others' skillsets & intentions
Communicates with utmost clarity, while maintaining the highest standards of candor
Adopts simplicity as a clutter-breaking mechanism to waste less time and fewer resources
Strives for perfection, iteratively, to deliver work that is above and beyond accepted standards of excellence
Work Experience
Entry-level position suitable for recent graduates or candidates with 0-1 years of relevant experience in data engineering, data science, or related fields
Benefits
Learning & Growth:
Hands-on experience with cutting-edge satellite imagery and geospatial technologies
Exposure to enterprise-scale data infrastructure and MLOps practices
Mentorship from experienced data engineers and ML practitioners
Opportunity to work on real-world problems with global impact
Technical Exposure:
Work with petabyte-scale satellite imagery datasets
Learn industry best practices for data governance and quality management
Gain experience with a modern data stack and infrastructure-as-code
Understand the complete ML lifecycle from data to deployment
Career Development:
Clear path for conversion to a full-time role based on performance
Opportunity to shape and influence data strategy and processes
Exposure to cross-functional collaboration in a fast-paced environment
Access to continuous learning resources and professional development