HubofML - Newsletter #8
Predicting Ad Value at Twitter, Scaling Data Platform at DoorDash, Building Background Feature in Google Meet, Scaling Live Streaming to Millions of Viewers at Facebook and More
Welcome to another edition of my newsletter, the eighth this year, which I hope will spark new ideas 💡 and provide you with useful information on how tech companies tackle various engineering problems. 💯
If you missed the last one, you can catch up here.
I want to make sure each edition brings something valuable to you; that's why your feedback matters to me. If you have any ideas or requests for future editions, let me know.
I hope you enjoy this month's edition. Please forward it to a friend, colleague, or anyone who comes to mind. 🙏
How Facebook built and deployed a real-time neural text-to-speech system on CPU servers, delivering industry-leading compute efficiency and human-level quality.
A post on how Twitter is using machine learning to predict the value of ad requests.
Here is how DoorDash is able to deliver a reliable data platform that enables optimal business operations, pricing, and logistics as well as improved customer obsession, retention, and acquisition.
Google recently announced ways to blur and replace your background in Google Meet, which uses machine learning (ML) to better highlight participants regardless of their surroundings. Here is a post on how Google engineers built this feature into Google Meet.
This is how Airbnb improves the DNN architecture of its search ranking.
Pensieve is an embedding feature platform developed at LinkedIn to pre-compute and publish entity embeddings. This post describes its architecture and how it works.
A quick introduction to TFX and how to deploy an ML project to production using TFX, Google AI Platform Pipelines, and Kubeflow.
A comprehensive guide to loss functions in PyTorch.
A comprehensive overview of techniques for structured key-value pair information extraction from invoices. The post reviews research papers that explore data extraction and touch upon how to get started implementing the methods.
Learn TorchServe with examples.
Building ML models is no excuse for writing sloppy code. Pratik Bhavsar wrote a post on ten best practices for Python developers.
PyTorch Lightning provides a Python wrapper for PyTorch that lets data scientists and engineers write clean, manageable, and performant training code. Caleb Kaiser wrote a post on how to deploy PyTorch Lightning models to production.
How Facebook built a system capable of managing both user-generated content (UGC), captured on all kinds of devices at differing quality levels, and broadcast-quality high-res streaming, while working reliably for billions of people around the world.
How Uber migrated hundreds of millions of customers between two asynchronous accounting systems while maintaining data consistency, with a goal of zero impact on users.
The Medium Digest contains a list of personalized stories for Medium users. This post describes how Medium scaled the infrastructure responsible for the Medium Digest.
CI/CD pipelines allow code to be written quickly and pushed to user-facing applications and services. Though they boost productivity, they have also caused problems such as site or service outages when bad code, configuration, or AI models are pushed to production. The post introduces how LinkedIn is using dark canary clusters to detect problems before they hit production.
In this post, Hammad Khalid, a lead software developer at Shopify, describes some engineering and management mental models that he has found useful over the years.
Continuous integration and continuous deployment are key to a company's ability to release software frequently. Here are seven tips for creating a successful CI/CD pipeline.
Timing is a particular sort of luck, so in some ways you can simplify things even further, down to just luck and work. One of the most effective ways to get luckier is to be more visible within your organization. Read more on how to achieve visibility, both inside and outside your organization.
We join new teams all the time. Anna Shipman wrote about what she did to break down barriers with her team and make them feel less intimidated about approaching her as Technical Director of FT.com.
The role of an Engineering Manager can be summed up in people, delivery, and process. In this post, Rodrigo Flores explains what is expected of every engineering manager.
Efficient onboarding is vital as it helps with employee retention rates, clarifies and sets expectations for the new hire's role, and lowers employee stress. Here is what Medium's onboarding process looks like.
In this post, Pat Kua dismisses a common misconception about being a leader.