Data Engineer

Job description

What do we do?

We gather and process machine learning training data for AI applications worldwide, serving cutting-edge AI businesses as well as Fortune 500 companies. We count Amazon, Sony and Portugal Ventures among our investors and are proud to be one of the fastest-growing companies in the AI field.

How do we do it?

DefinedCrowd’s culture is built on our four core values: Trust, Innovation, Passion, and Creativity. We like to think of ourselves as a multi-talented, quirky and hardworking group dedicated to building a great platform, making our customers and community happy, and making our employees feel at home.

How can you help?

We are currently looking for talented new members across the world to join this energetic, hardworking and fun team in our Seattle headquarters, our R&D centers in Lisbon and Porto, or our office in Tokyo. In this role, you will:


  • Ensure quality and integrity of the data provided to end users
  • Develop reliable, easy-to-modify data processing pipelines
  • Stay aware of other departments' data needs
  • Adapt and extend our data infrastructure as our operational services grow
  • Build and improve infrastructure for data science and analytics

What do we offer?

  • The opportunity to learn industry best practices
  • Flexible working conditions
  • Fresh fruit, coffee, snacks and a healthy working environment
  • International and diverse teams (25+ nationalities across 3 countries)

Location: Lisbon or Porto


Required Skills:

  • Background in Computer Science, Engineering, or a related field
  • Software development experience in one or more of the following languages: Java, Python, C#
  • Understanding of distributed systems
  • Ability to design scalable, data-intensive architectures
  • Knowledge of the most common trade-offs in data storage and retrieval

Bonus Points:

  • Experience with large-scale computation frameworks (e.g., Spark)
  • Experience with Kafka and/or RabbitMQ
  • Experience with data warehousing
  • Experience with SQL and/or NoSQL databases
  • DevOps skills: CI/CD pipelines, Bash scripting, Docker, Kubernetes
  • Experience with training or productizing machine learning models
  • Experience with batch and/or streaming data pipelines