Data Engineer
Base Operations
Location: US-based, Remote
Who We Are:
Base Operations decodes the world’s threat landscape into actionable security insights that enable organizations to protect their people, assets, and operations. We empower security teams to better assess threats, manage risk across large footprints, and make data-driven decisions using granular, street-level intelligence.
As a category-creator in a growing market, we are looking for impact-driven individuals who are passionate about technology and are committed to helping our customers operate with efficiency and greater accuracy. We developed our technology at Harvard and MIT and use cutting-edge AI. The company is award-winning and backed by top-tier VCs. Our customers include some of the most sophisticated and relevant companies in the Fortune 500. Our team cares about solving hard problems, promoting transparency, encouraging growth in emerging markets, and empowering people to safely explore the world. We are a thoughtful, dedicated, and fun-loving group obsessed with disrupting the security intelligence industry and winning the loyalty of our customers.
Overview:
Base Operations is building the world’s largest dataset of global threat patterns and street-level intelligence. We are looking for an exceptional data engineer who can define, build, and mature the data pipelines, models, and warehouse that will help us achieve this objective. This individual should have the technical acumen to contribute thought leadership to our architecture strategy, while also excelling at turning that strategy into a tactical plan and executing against it.
RESPONSIBILITIES
You will:
Build, test, and maintain robust ingestion and transformation pipelines, including the injection of NLP models to extract and augment data from free-text sources.
Develop data transformation, validation, and analysis methods to augment the utility and actionability of data.
Build out a data warehousing strategy to realize and enhance the full data lifecycle.
Operationalize data quality throughout, ensuring visibility and timely remediation of data quality issues.
Contribute to GIS data architecture strategies, and drive transformation and implementation activities resulting from those strategies.
Required Experience
3+ years of experience building production-level data pipelines and standing up the platforms to support them.
3+ years of experience implementing and maintaining data warehouses and data models, and performing schema migrations. Experience standing up a data warehouse from scratch, as well as with AWS tools such as Athena and S3, is a big plus.
Strong working knowledge of SQL, data modeling, and schema migrations.
Demonstrable thought leadership on operationalizing data quality.
Development experience delivering production-ready code in Python. Familiarity with the pandas library.
Strong collaborator across functional areas, including data science, infrastructure, software development, and product.
Preferred Experience
Hands-on knowledge of GIS-enabled data stores such as PostgreSQL/PostGIS, GIS-related libraries such as GeoPandas, and analytic data platforms such as Databricks.
Experience with infrastructure and automation tools such as Jenkins, Argo Workflows, Argo CD, Terraform, Kubernetes, or equivalent.
Deep understanding of ML- and AI-based infrastructure.
Familiarity with JavaScript.
In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.
Base Operations champions diversity: we welcome and employ people regardless of race, color, ancestry, religion, gender, gender identity, genetic information, parental or pregnancy status, national origin, sexual orientation, age, citizenship, marital status, disability, or veteran status. We are proud to be an equal opportunity employer.