We combined a neural network with decision trees to imitate, and even outperform, real people at evaluating website design.
In the future, this module will evaluate the output of the generative design algorithm, the key element of uKit AI, which will design pages without human involvement, relying on the available content and on "knowledge" of the difference between an ineffective website and one built for conversion.
The current version of WebScore AI reflects the average Internet user's view of a website's appearance. We can create other variants, though; for example, one that rates a website's usability.
Websites used to teach the system
First of all, we collected 12,000 sites and online stores created in different years, on various platforms, and in different languages. The main task was to get enough examples of every visual grade, from pretty bad websites to very good ones. With this ‘concentrated’ sample we showed the system what it can encounter on the modern web.
Each grade is measured on a scale, and this scale should be clear to the ordinary person whose opinion we are trying to model. So we arrived at the idea of a scale ‘from 1 to 10’, which is what our service uses.
People who are imitated by WebScore AI
We needed two things to form a dataset (a set of data for training the model) from a variety of websites:
- the features by which the system will determine whether a website is attractive;
- the ratings (marks) on our scale for a certain number of websites. They will serve as the reference for the system.
Someone has to assign these initial ratings. Such a "teacher", or, to be precise, a group of "teachers", will greatly affect how the model works.
To assemble a focus group, we conducted a preliminary selection of candidates on 1,500 website examples. It is routine work, but it carries responsibility and demands great focus. The preliminary selection helped us weed out unsuitable candidates and also exclude "controversial" websites (those that one person rates 1 and another rates 10) from the sample.
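A filter for "controversial" websites could look roughly like the sketch below. It is only an illustration: the function name, the `MAX_SPREAD` threshold, and the sample marks are assumptions, since the exact rule used during the preliminary selection is not described here.

```python
from statistics import mean

# Hypothetical threshold: the only rule stated in the text is that a site
# rated 1 by one person and 10 by another is "controversial".
MAX_SPREAD = 6

def split_controversial(ratings_by_site, max_spread=MAX_SPREAD):
    """Separate consistently rated sites from "controversial" ones.

    ratings_by_site maps a site URL to the list of 1-10 marks it received.
    Returns (kept, dropped): kept maps URL -> average mark,
    dropped is the list of URLs whose marks disagree too much.
    """
    kept, dropped = {}, []
    for url, marks in ratings_by_site.items():
        if max(marks) - min(marks) > max_spread:
            dropped.append(url)
        else:
            kept[url] = mean(marks)
    return kept, dropped

# Invented sample data, for illustration only.
sample = {
    "a.example": [7, 8, 6],
    "b.example": [1, 10, 5],
    "c.example": [3, 4, 2],
}
kept, dropped = split_controversial(sample)
```

Here `b.example` is dropped because its marks range from 1 to 10, while the other two sites keep the average of their marks.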
At first, we experimented with evaluation methods.
For example, we asked respondents to evaluate one website at a time, then two websites at the same time, or to choose the more attractive of two. The approach in which the respondent saw a single website and rated it worked best. We used it to evaluate the remaining 10,000 websites.
A person evaluated whether a website is beautiful or not. How will the machine do this?
You and I need only one look to form an opinion on the overall beauty of something. But we know that the devil is in the details.
The features of visual attractiveness that will guide the model are key to the whole project. We asked the uKit website builder design team for a hand: their work serves as the basis for hundreds of thousands of websites, and millions of people see it. Together we compiled an extended list of features that professionals pay attention to when developing a website design, and then tried to trim it, leaving only the most important.
As a result, we got a checklist of 125 quite different, yet significant criteria, grouped into fifteen categories. For example, the list includes adaptation to popular screen sizes, the variety of font sizes, the purity of colors, the length of headings, the proportion of the page occupied by images, and so on. What remained was to train the model on these rules.
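To make the idea of a numeric criterion more concrete, here is a small sketch that computes three of the criteria mentioned above from a simplified page description. The `page` dictionary, its keys, and the feature names are hypothetical; a real pipeline would have to derive such values from the rendered page, and the actual definitions behind the 125 criteria may differ.

```python
def extract_features(page):
    """Turn a simplified page description into numeric features.

    `page` is a hypothetical dict used only for this illustration.
    """
    headings = page["headings"]
    if headings:
        avg_heading_length = sum(len(h) for h in headings) / len(headings)
    else:
        avg_heading_length = 0.0
    return {
        # Number of distinct font sizes used on the page.
        "font_size_variety": len(set(page["font_sizes_px"])),
        # Average heading length in characters.
        "avg_heading_length": avg_heading_length,
        # Share of the page area covered by images, between 0 and 1.
        "image_area_ratio": page["image_area_px"] / page["page_area_px"],
    }

# Invented example page.
page = {
    "font_sizes_px": [14, 14, 16, 24, 24, 32],
    "headings": ["Welcome", "Our services"],
    "image_area_px": 300_000,
    "page_area_px": 1_200_000,
}
features = extract_features(page)
```

Each website then becomes a row of such numbers, which is the form a tree-based model can work with.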
Creating an algorithm
What exactly is ‘training a model’? It is the construction of an algorithm that, based on a given set of features, can evaluate a selected website. Ideally, the gap between the system’s rating and the average teacher’s rating should be as small as possible.
We decided to use gradient boosting over decision trees, because it is one of the most popular and effective approaches. It builds an ensemble out of basic algorithms, and the ensemble’s overall result exceeds the result of any individual algorithm.
Moreover, each basic algorithm added to the ensemble tries to improve the quality of the answers of the ensemble as a whole.
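The principle can be shown with a deliberately tiny, from-scratch sketch: each new one-split tree ("stump") is fitted to the errors the ensemble still makes. This is a textbook illustration for squared error, not our production code; the toy data, function names, and hyperparameters are assumptions.

```python
def fit_stump(X, residuals):
    """Find the one-split tree that best fits the current residuals."""
    best = None
    for f in range(len(X[0])):
        for threshold in sorted({row[f] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[f] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[f] > threshold]
            if not left or not right:
                continue
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            error = (sum((r - lv) ** 2 for r in left)
                     + sum((r - rv) ** 2 for r in right))
            if best is None or error < best[0]:
                best = (error, f, threshold, lv, rv)
    if best is None:  # no usable split: predict the mean residual everywhere
        avg = sum(residuals) / len(residuals)
        return (0, float("inf"), avg, avg)
    return best[1:]  # (feature, threshold, left_value, right_value)

def predict_stump(stump, row):
    f, threshold, lv, rv = stump
    return lv if row[f] <= threshold else rv

def fit_boosting(X, y, n_trees=50, learning_rate=0.3):
    """Start from the mean mark, then add stumps that correct the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)

def mse(model, X, y):
    return sum((predict(model, row) - t) ** 2 for row, t in zip(X, y)) / len(y)

# Invented toy data: two features per site and an average mark from 1 to 10.
X = [[2, 0.10], [3, 0.25], [4, 0.40], [6, 0.15], [8, 0.05], [3, 0.30]]
y = [6.0, 7.5, 8.0, 4.5, 2.0, 7.0]

model = fit_boosting(X, y)
training_error = mse(model, X, y)
```

On the training data, an ensemble of 50 stumps has a lower error than a single stump, which is the effect described above.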
To speed up and simplify the process, we used the CatBoost library from Yandex, which builds gradient boosting on so-called "oblivious decision trees". This ensures good training of the model from scratch and a quick transition to making predictions (estimates) for new objects.
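An oblivious tree asks the same question at every node of a given level, so a path through the tree is simply a sequence of yes/no answers that can be read as a binary number. The sketch below illustrates that data structure only; it is not CatBoost's implementation, and the class name, splits, and leaf values are made up for the example.

```python
class ObliviousTree:
    """A decision tree that uses one (feature, threshold) test per level."""

    def __init__(self, splits, leaf_values):
        # splits: one (feature_index, threshold) pair per level.
        # leaf_values: 2 ** depth values, one per combination of answers.
        if len(leaf_values) != 2 ** len(splits):
            raise ValueError("need exactly 2 ** depth leaf values")
        self.splits = splits
        self.leaf_values = leaf_values

    def leaf_index(self, row):
        """Read the answers to all level tests as the bits of the leaf index."""
        index = 0
        for feature, threshold in self.splits:
            index = (index << 1) | int(row[feature] > threshold)
        return index

    def predict(self, row):
        return self.leaf_values[self.leaf_index(row)]

# Invented depth-2 tree: level 1 tests feature 0, level 2 tests feature 1.
tree = ObliviousTree(
    splits=[(0, 4), (1, 0.2)],
    leaf_values=[5.0, 8.0, 2.0, 4.0],
)
score = tree.predict([3, 0.3])  # answers: no, yes -> leaf 0b01
```

Because finding a leaf takes a fixed number of comparisons and one table lookup, prediction with such trees tends to be fast.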
Adding a neural network
When the basic algorithm was ready, we decided to run an experiment: would the results improve if we added a neural network? We already knew how to ‘look at’ a website and its design, and now we decided to give the system a kind of ‘magnifying glass’ with which it could reveal even more details.
We chose one of the most popular networks, ResNet50, which is known as a good algorithm for extracting high-level features, and learned how to obtain 1,000 additional attributes for website evaluation. As a result, the system now characterizes a URL by a total of 1,125 features and finds the website’s ‘place’ on a 10-point scale. The process takes up to tens of seconds, which is why we are considering speeding up the model by reducing the number of features while keeping the quality of the evaluation at the same level.
The model trained this way estimates three times more accurately than individual ‘teachers’ do.
We can say that the model surpassed its first teachers, since the focus group’s individual ratings deviate from the average rating more strongly than the model’s rating does. Now we are putting the algorithm online for further training. And you can become its teacher.
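For readers who want to see what such a comparison involves, here is a minimal sketch. It assumes the gap is measured as the mean absolute deviation from the teachers' average mark, which is only one possible metric; all the numbers are invented and do not reproduce our results.

```python
from statistics import mean

def mean_abs_deviation(estimates, reference):
    """Average absolute gap between a set of estimates and the reference marks."""
    return mean(abs(e - r) for e, r in zip(estimates, reference))

# Invented toy marks: each row is one teacher, each column is one website.
teacher_marks = [
    [7, 3, 9, 5],
    [5, 4, 7, 6],
    [9, 2, 8, 4],
]
# Hypothetical model output for the same four websites.
model_scores = [7.5, 3.0, 8.0, 5.0]

# The reference is the average teacher mark for each website.
average_marks = [mean(column) for column in zip(*teacher_marks)]

# How far a typical individual teacher is from the average...
teacher_dev = mean(mean_abs_deviation(row, average_marks) for row in teacher_marks)
# ...and how far the model is from the same average.
model_dev = mean_abs_deviation(model_scores, average_marks)

model_is_closer = model_dev < teacher_dev
```

If `model_dev` is smaller than `teacher_dev`, the model sits closer to the group's average opinion than a typical individual teacher does.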