Yandex Data Factory, Zavalishina became a data scientist by field practice, starting with building up parent company Yandex's web portal (Yandex is the largest search provider in Russia). Working out of their Amsterdam headquarters, it's Zavalishina's job to make sure Yandex Data Factory delivers business results, not just analytics.
Yandex Data Factory bills itself as "the machine learning and data analytics experts that use data science to improve business’ operations, revenues and profitability." But during our conversation, Zavalishina cautioned against the romanticization of data as a business asset. She also shared details from customer projects, arguing you cannot be data driven unless you are willing to move from dashboards to bold experiments. She also had some interesting views on being a Russian female technologist.
Jon Reed: Jane, you didn't start out in data science - how did you get there?
Jane Zavalishina: I started working at Yandex in the year 2000. The internet was small back then, but it was booming. It was an exciting time. When I came to Yandex, it was a fifty-person company. Now, it's more than 6,000. I came on as a Project Manager, and very quickly became Chief Product Officer. My main job was to make Yandex the number one portal in Russia, because when I came we were number four. As Chief Product Officer, I was actually responsible for everything but search.
Reed: So how did you get to number one?
Zavalishina: We launched some different web services, like Yandex News and free webmail. We had a big search audience. The problem was, back in 2000, the average internet user would use search two times a week. We had to attract portal traffic in other ways.
Reed: So what happened next?
Zavalishina: I left Yandex in 2003. The job was accomplished, and Yandex became number one. I became bored. Then, at the end of 2005, I rejoined Yandex to become the head of Yandex Money. In 2013, I was asked to lead Yandex Data Factory (Editor's note: Yandex Data Factory launched in December 2014). Arkady Volozh, the founder of Yandex, approached me and said, "We have this new idea. It is kind of a start-up, but it will be a start-up within Yandex." He described the idea of the Yandex Data Factory, and I was indeed excited and took it on.
Shifting internal data assets to serve clients
Reed: Does this go back to Yandex's strength in search?
Zavalishina: In a way, yes. Yandex realized that they had some unique data assets, which they mostly use for one purpose: to generate money from the Russian advertising market, which is more than 90% of Yandex's income. These assets are unique and are very much in demand nowadays, because what Yandex does better than many other companies is, in fact, working with big data: processing that data and analyzing it. Yandex is very, very strong in machine learning because that's what search is all about.
Reed: So how did Yandex move from search expertise to helping clients?
Zavalishina: The idea for Yandex Data Factory actually came from talking to our partner CERN. We started to talk to CERN physicists about the problems they had with the huge volumes of data they had to analyze, looking for very specific and rare instances. Our team immediately realized CERN's problem is very much like the problem we had already quite successfully solved for ourselves. So we said, "Why don't you take one of our proprietary algorithms we use for Yandex and try and make it work for you, at CERN, to help physicists use it in their work?" They did it - and that was successful.
Reed: And an internal startup was born.
Zavalishina: Yes. That's when Yandex started to see, "Maybe, these assets are not just benefits to what we do in consumer Internet. Maybe they can be applied much wider to different industries." That's how Yandex Data Factory started.
Data as an asset is overrated - deriving data value is underrated
Reed: Correct me if I'm wrong, but I believe you are skeptical about data as an asset. How far can these assets take you?
Zavalishina: Of course, we have this data asset, but for us, at Yandex Data Factory, it's not the biggest thing. To be honest with you, I'm not much of a believer in this idea that data is such a valuable asset. There are famous quotes from IBM about 90 percent of data being created in the last two years, which means that it's exponential growth, which means that data quickly becomes a commodity.
What's really a valuable asset is your ability to work with that data properly. What Yandex is very good at is working with data at all of the stages, from cleaning raw data to building algorithms. We do have this advantage of having great algorithms and also a great pool of data scientists within Yandex.
Reed: I guess it helps if you have your own data university...
Zavalishina: Yes, that's the Yandex School of Data Analysis, which provides Master's degrees in Computer Science and Data Science. Of course, people who graduate from the Yandex School of Data Analysis can go anywhere they want after graduation, but most of them end up in Yandex.
Customer field stories
Reed: So tell us about how this works on customer projects.
Zavalishina: We were approached by a steel company. They were thinking about how to use the huge amount of data they've been storing for the last ten years about the steel production process. In this case, we met with them - combined meetings with business and data science professionals. Usually, with a new customer, we try to find something we can accomplish quickly - usually within ten months. The key criteria are that the expected value from this optimization is big enough, the relevant data exists, and we can experiment with it.
Reed: And what are you trying to accomplish in this case?
Zavalishina: We are working to ensure the necessary process quality but reduce the production cost, which can be done with precise predictive models. It's very complex: different clients have different quality requirements, so you produce different types of steel. Also, at the start of this process, the source materials vary in quality. So you have to analyze all the source material data, and then, in the production process, you need to make the decision about how many additives you add. With the right predictive model, you can reduce the amount of additives and still be fine on quality.
Reed: So where does that project stand now?
Zavalishina: The experiment is now ongoing. What we know already is that the quality of steel they produce is good enough; it complies with the requirements of customers. The real question is how much money we can save during this process. We should have some final results later in December.
Reed: And you can apply this same process to different industries?
Zavalishina: In one of our HR projects, we were predicting which employees were going to leave the company within the next year. This company had thousands of employees, and retention was an issue. If you can use predictive models to determine who is likely to leave, you can do something about it. You can give them additional money or a promotion or send them to a conference, or reward them somehow. It's a lot cheaper than replacing them. We were quite successful in this project, predicting who was going to leave.
Then, we were asked by another company, "If you are building such models, maybe you could help us with optimization of our motivation scheme." It was a retail company - they wanted to make sure their bonus schemes do motivate their employees to sell more and sell better.
If you want to be data-driven, you must experiment
Reed: You've spoken about the need for experimentation if you want these projects to succeed - why is that such an important criterion?
Zavalishina: If a company's not willing to experiment, that's a problem. There is no other way to be data-driven; you need to experiment and measure results. This is probably the biggest challenge for all the companies that are trying to become data-driven. If you want to be data-driven, you need to reinvent your culture into an experimenting one. You need to understand how to do that in many different parts of your business.
Being a female tech executive: "I'm a lucky one"
Reed: Before we wrap up, do you have a story on being a woman in technology?
Zavalishina: I can comment on that, but I'm not sure it's the comment you are looking for, because what I think is that I'm a lucky one. I can see how often it is a problem, but because I started in this Internet industry very early, it wasn't just being a woman. In fact, I became Chief Product Officer of Yandex when I was 22. Can you believe that?
I was just a girl, but that happened because of two things, I think. The first one is, in Russia, where it all happened, we have much less difference between men and women in the workplace. The Communists needed all the workforce they could get.
The second is that when something so new is quickly developing, as the Internet was in the late 1990s, people need to do great things quickly. That's when people actually look at what you can do, and not how you look. In Yandex, it was always like that.
Image credit: Experiments in laboratory © Sergey Nivens - Fotolia.com
Disclosure: Diginomica has no financial ties to Yandex. I was approached by Yandex PR and pursued the story as I found Zavalishina's views worth covering.