Talent.com
Fine Tuning / Post Training Data Scientist - RL (GRPO, PPO, RLHF)

Binance · Work from home, Daerah Istimewa Yogyakarta ꦝꦌꦫꦃꦆꦱ꧀ꦠꦶꦩꦺꦮꦪꦺꦴꦒꦾꦏꦂꦠ, Indonesia
4 days ago
Job Description

Binance is a leading global blockchain ecosystem behind the world’s largest cryptocurrency exchange by trading volume and registered users. We are trusted by over 280 million people in 100+ countries for our industry-leading security, user fund transparency, trading engine speed, deep liquidity, and an unmatched portfolio of digital-asset products. Binance offerings range from trading and finance to education, research, payments, institutional services, Web3 features, and more. We leverage the power of digital assets and blockchain to build an inclusive financial ecosystem to advance the freedom of money and improve financial access for people around the world.

About the Role

You will develop and optimize Reinforcement Learning (RL) models for enterprise-scale applications such as customer service, token reporting, compliance, and Web3 domain reasoning.

You will explore and evaluate advanced algorithms including PPO, GRPO, DPO, RLHF, RLAIF, and Agentic RL to enhance the capabilities of LLMs, VLMs, and Agentic AI at Binance. The role requires a strong theoretical foundation in RL (covering policy optimization, reward modeling, and planning) paired with the engineering skills to build scalable production systems.

You will take full ownership from research through deployment, driving experimentation with systematic evaluation and benchmarking. Collaboration across research, infrastructure, and application teams will be key to delivering impactful AI solutions.

Responsibilities

  • Research and develop state-of-the-art RL algorithms, focusing on large model optimization and alignment techniques.
  • Design and implement RL training pipelines, including environment simulation, data generation, and reward function design.
  • Apply Reinforcement Learning methods to enhance LLM / VLM / Agentic AI capabilities in reasoning, planning, and autonomous decision‑making.
  • Collaborate with engineers and researchers to integrate RL solutions into enterprise AI platforms.
  • Monitor model performance in production and continuously improve through iterative training and fine‑tuning.

Requirements

  • Master’s Degree in Computer Science, Applied Mathematics, Machine Learning, or related fields.
  • 5+ years of hands‑on experience in RL and in optimizing at least one of the following: LLMs, VLMs, or Agentic AI.
  • Strong coding skills in Python, with experience in ML frameworks and RL libraries.
  • Experience with large-scale distributed training and optimization.
  • Self‑driven, ownership mindset, and strong problem‑solving skills. Excellent communication skills for cross‑functional collaboration.

Why Binance

  • Shape the future with the world’s leading blockchain ecosystem
  • Collaborate with world-class talent in a user‑centric global organization with a flat structure
  • Tackle unique, fast‑paced projects with autonomy in an innovative environment
  • Thrive in a results‑driven workplace with opportunities for career growth and continuous learning
  • Competitive salary and company benefits
  • Work‑from‑home arrangement (the arrangement may vary depending on the work nature of the business team)

Binance is committed to being an equal opportunity employer. We believe that having a diverse workforce is fundamental to our success.

By submitting a job application, you confirm that you have read and agree to our Candidate Privacy Notice.
