Data Scientist – Natural Language Processing

Job description

What do we do?

We gather and process machine learning training data for AI applications worldwide, serving cutting-edge AI businesses as well as Fortune 500 companies. We count Amazon, Sony and Portugal Ventures amongst our investors and are proud to be one of the fastest-growing companies in the AI field.

How do we do it?

DefinedCrowd’s culture is built on four core values: Trust, Innovation, Passion, and Creativity. We like to think that we are a multi-talented, quirky and hard-working group dedicated to building a great platform, making our customers and community happy, and making our employees feel at home.

How can you help?

We are currently looking for talented new members across the world to join this energetic, hard-working and fun team in our Seattle headquarters, our R&D centers in Lisbon and Porto, or our office in Tokyo. In this role, you will:


  • Apply your knowledge in Natural Language Processing and Machine Learning to research, solve and productize hard problems in a crowdsourcing context
  • Research, design, implement and optimize machine learning services on the platform
  • Help define the integration strategies and application infrastructure required to implement a complete solution and monitor its performance
  • Work with other machine learning team members in designing, testing and implementing cutting-edge machine learning models
  • Carry out structured, concise and reproducible data analyses that generate insight into different types of data
  • Conduct state-of-the-art research with a clear path to production

What do we offer?

  • The opportunity to learn industry best practices
  • Flexible working conditions
  • International and diverse teams
  • Fresh fruit and a healthy working environment

Location: Lisbon or Porto


Required Skills:

  • MSc/PhD in Computer Science or equivalent
  • Broad knowledge of supervised ML (both classic and deep learning approaches)
  • Knowledge of Statistics and Data Analysis, with the autonomy to explore different kinds of data and formulate novel solutions to problems
  • Knowledge of Deep Learning frameworks (e.g. Keras, TensorFlow)
  • Sound experience with the Python ecosystem for scientific computing (numpy, pandas, scikit-learn, matplotlib, etc.)
  • Critical thinking skills, demonstrated in previous research projects, and clear communication
  • Excellent speaking and writing skills in English
  • Proven experience in one or more common tasks in Natural Language Processing (e.g. Language Modelling, Conversational Agents, Named Entity Tagging, Information Retrieval, Topic Modelling, Sentiment Analysis)

Nice to have:

  • PhD in Computer Science or equivalent
  • Relevant publications in the field of Natural Language Processing
  • Enterprise experience in Natural Language Processing
  • Experience with crowdsourcing data labelling for machine learning problems
  • Previous experience with Big Data technologies (e.g. Hadoop, Spark)