HubofML#12: How ML Saves Dropbox $1.7m a Year, Scaling Product Delivery, Measuring Engineering Productivity, and More
Featured posts from Dropbox, Facebook, GoogleAI, Intercom, Twitter, Amazon, and more.
As always, below are my best findings from the last month. I hope this edition sparks ideas in you. If you missed the last one, you can catch up here.
If you enjoy this newsletter, please share it with others who might find it interesting.
Machine Learning/Data Science
How do you build a system that presents so many different types of content in a way that’s personally relevant to billions of people around the world? Just like Dropbox uses ML to predict what file you will work on next, Facebook uses ML to predict which content will matter most to each person on its platform to support an engaging and positive experience. In this post, Facebook’s engineers share details on designing an ML-powered News Feed ranking system.
Each year, organizations lose tens of billions of dollars to online fraud globally. Some of the most common types of online fraud include email account compromise, new account fraud, and non-payment or non-delivery. This post provides an approach to identify high-risk predictions from Amazon Fraud Detector and uses Amazon Augmented AI (Amazon A2I) to set up a human review workflow that automatically triggers a review for further investigation and validation.
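To make the idea of routing high-risk predictions to a human reviewer concrete, here is a minimal sketch in plain Python. It does not call Amazon Fraud Detector or Amazon A2I; the `Prediction` class, the 0 to 1000 score scale, and the `REVIEW_THRESHOLD` value are all assumptions made for illustration, and the post's actual workflow may differ.

```python
from dataclasses import dataclass

# Assumption: risk scores run from 0 (low risk) to 1000 (high risk).
# The cut-off below is a hypothetical value, not one taken from the post.
REVIEW_THRESHOLD = 900


@dataclass
class Prediction:
    event_id: str
    risk_score: float


def needs_human_review(prediction, threshold=REVIEW_THRESHOLD):
    """Return True when the score is at or above the review threshold."""
    return prediction.risk_score >= threshold


def split_for_review(predictions, threshold=REVIEW_THRESHOLD):
    """Split predictions into (sent to human review, handled automatically)."""
    review, automatic = [], []
    for prediction in predictions:
        if needs_human_review(prediction, threshold):
            review.append(prediction)
        else:
            automatic.append(prediction)
    return review, automatic
```

In a real setup, the `review` list would be handed to whatever human review tooling you use, and the threshold would be tuned against how many cases your reviewers can realistically handle.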
Dropbox already uses ML to power features such as search, file and folder suggestions, and OCR in document scanning. This time, they extended it to document previews, which saves them a ton of $$$. In this post, the Dropbox team shares how they use ML in document previews to save $1.7 million a year in infrastructure costs.
Last month, Google announced the open-source release of Model Search, a platform that allows one to develop the best ML models efficiently and automatically. Model Search is domain agnostic, flexible, and capable of finding the appropriate architecture that best fits a given dataset and problem while minimizing coding time, effort, and compute resources. Model Search is built on TensorFlow and can run either on a single machine or in a distributed setting.
Many real-world problems involving networks of transactions, social interactions, and engagements are dynamic and can be modeled as graphs where nodes and edges appear over time. In this post, @emaros96, a researcher at Twitter, and @mmbronstein, head of ML graph at Twitter, describe Temporal Graph Networks, a generic framework for deep learning on dynamic graphs.
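If "graphs where nodes and edges appear over time" sounds abstract, one way to picture it is as a stream of timestamped interactions. The sketch below is not Temporal Graph Networks itself, only an illustration of the underlying data model; the `Interaction` class and the `neighbors_before` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One edge event in a dynamic graph: two nodes interacting at a time."""
    source: str
    target: str
    timestamp: float


def neighbors_before(events, node, t):
    """Return the nodes that interacted with `node` strictly before time t."""
    seen = set()
    for event in events:
        if event.timestamp >= t:
            continue  # this edge has not appeared yet at time t
        if event.source == node:
            seen.add(event.target)
        elif event.target == node:
            seen.add(event.source)
    return seen
```

The point is that the same node can have a different neighborhood depending on when you ask, which is what a model for dynamic graphs has to account for.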
From premature optimization to over-engineering solutions for your product, it’s easy to get caught up in making technology decisions that slow you down instead of speeding you up. Startups are about growth and speed, and you should avoid anything that could slow you down as much as possible. That’s why I particularly enjoyed reading this post from @brian_scanlanis, a principal systems engineer at Intercom, on ten strategies to avoid when scaling a startup.
If you can’t measure it, you can’t improve it. Measuring developer productivity is difficult but possible, and it is worth doing. I came across this insightful post on how to measure engineering productivity.
“The dirty secret of Silicon Valley is that most great product teams follow a system that resembles waterfall (gasp!) to launch new innovative features/products repeatably. The system starts with high conviction based on judgment, intuition, and instinct rather than relying on iterative customer feedback to build conviction over time.” A good read from Matt Greenberg, CTO at Reforge, and Andy Johns.
The big difference between leading and managing is simple. When you manage, you have the authority to make decisions unilaterally. But when you lead, you appeal to the self-interest of the people being led. As a manager or individual contributor, you may find yourself in situations where you need to wield influence without the authority that comes from coercive power; such cases will require you to lead.
“Company culture can be defined as a set of shared values, goals, attitudes and practices that characterize an organization.” But how do you measure culture in an engineering context? In this post, David Xiang describes three standard-issue software engineering activities that can be used as metrics for software engineering culture.
In the last edition, I mentioned how Slack migrated to Vitess to tackle scale problems. Intercom’s case is not that different. They were facing a database scalability problem, and they decided to confront it by transitioning to sharded databases. Here is a post on their journey to sharded databases.
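For readers new to sharding, the core idea is that each row lives on exactly one of several databases, chosen by a key. Here is a minimal sketch of hash-based routing; it is a generic illustration, not Intercom's or Slack's actual scheme, and the `shard_for` helper and the example key are hypothetical.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key (e.g. a customer ID) to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Use a stable hash so the same key always lands on the same shard,
    # across processes and restarts (Python's built-in hash() is not stable).
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

One trade-off worth knowing: with plain modulo routing, changing the number of shards moves most keys to a different shard, which is why production systems often add a lookup table or consistent hashing on top.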
Joel Goldberg retired after working in the software industry for over four decades. This post shares some valuable lessons and principles that have helped him throughout his career as a software engineer.
The good thing about levels is the transparency. When it’s clear to people what’s expected of them, stress goes down, and it’s much easier to give frequent feedback against a clear reference. @dafnaros shares details on how they introduced levels to their teams without losing people or hurting productivity.
Thanks for reading! If you like this newsletter and want to support it, please share it with others.