
Mini Rank Formula

September 3, 2010

Let me introduce a special MiniRank (MR), similar to Google's PageRank (PR), and a simplified formula for calculating its value.

The idea is that if two results have the same relevancy rank (a score calculated by whatever search algorithm is in use), the item with the better MR is ranked higher.
In other words, between equally relevant results, the weight of MiniRank should always prevail.
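
To make the tie-breaking idea concrete, here is a minimal sketch in Python. The `Result` type, its field names, and the `rank_results` function are hypothetical names made up for illustration. The sketch assumes that both scores are plain numbers where higher is better.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    # All field names here are hypothetical, for illustration only.
    url: str
    relevancy: float  # score from whatever search algorithm is in use
    mini_rank: float  # MR value; assumed "higher is better"


def rank_results(results: List[Result]) -> List[Result]:
    """Order by relevancy first; MR decides between equal relevancy scores."""
    return sorted(results, key=lambda r: (-r.relevancy, -r.mini_rank))


if __name__ == "__main__":
    found = [
        Result("a.example", 0.8, 0.2),
        Result("b.example", 0.8, 0.9),
        Result("c.example", 0.9, 0.1),
    ]
    for r in rank_results(found):
        print(r.url, r.relevancy, r.mini_rank)
```

Here `a.example` and `b.example` are equally relevant, so the one with the better MR goes first, while `c.example` still leads because of its higher relevancy.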

Measuring Lightness
Although the full MR formula should stay secret (like Google's), I will mention at least the three main components of its current version:

1) Size
2) Design
3) Content

The Size (size-to-noise ratio) is measured as the arithmetic mean of several variables:

a) Relative content:
"Size of All Visual Text" to "Total Files' size" ratio.

It looks odd when a huge webpage has just a few lines of text, doesn't it? Is that useful for your visitors? They came to you for your content, not your design files. So respect them. Give them what they want, and they will keep returning to you again and again.

b) Visual content on page:
"Size of Useful Text" to "Size of All Visual Text" ratio.

The All Visual Text may include such things as ads, menus, a box of related links, and ... your logos and copyright messages. Although you have the right to put whatever you want on your page, it's more than likely your visitors will skip everything that is not directly related to the main content. So, the fewer distractions from the main content, the better!

In addition, the main formula is also inversely proportional to the absolute "Total Files' size". I think that's quite fair: the bigger the files a page uses, the lower its MR should be.
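
The Size part described above could be sketched roughly like this. It is only an illustration, not the real MR formula (which stays secret): the function names, the byte-based inputs, and the `reference_bytes` scaling constant are all assumptions, and only the two ratios (a) and (b) are averaged because those are the only variables named here.

```python
def size_to_noise(useful_text_bytes: int,
                  visual_text_bytes: int,
                  total_file_bytes: int) -> float:
    """Arithmetic mean of ratio (a) and ratio (b) from the text above."""
    if visual_text_bytes <= 0 or total_file_bytes <= 0:
        return 0.0  # nothing to measure
    relative_content = visual_text_bytes / total_file_bytes  # ratio (a)
    useful_share = useful_text_bytes / visual_text_bytes     # ratio (b)
    return (relative_content + useful_share) / 2


def size_component(useful_text_bytes: int,
                   visual_text_bytes: int,
                   total_file_bytes: int,
                   reference_bytes: int = 100_000) -> float:
    """Mean ratio scaled inversely by the absolute total size.

    reference_bytes is a hypothetical constant chosen only to keep the
    numbers readable; it is not part of the original description.
    """
    ratio = size_to_noise(useful_text_bytes, visual_text_bytes, total_file_bytes)
    if ratio == 0.0:
        return 0.0
    return ratio * reference_bytes / total_file_bytes


if __name__ == "__main__":
    # Two pages with identical ratios; the second one is half the size.
    print(size_component(2_000, 4_000, 40_000))
    print(size_component(1_000, 2_000, 20_000))
```

With identical ratios, the page with the smaller total file size gets the higher Size value, which is the behavior the inverse proportionality is meant to produce.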

So far, so good. A few bytes less in the files results in a lot more readability. Among pages with the same rank for Design and Content, the MR formula prefers the SMALLER ones, which is better for users!

Now let's review the rest of the components in the formula.
While the first part is objective, the other ones are somewhat SUBJECTIVE.
If this sounds disappointing, I would remind you that Google's formula is also subjective. Well, at least partly.

Since it was openly published in academic papers by Google's founders, Sergey Brin and Larry Page, there has been a never-ending process of updating and refining it (some even call that direct manipulation). Can anyone tell me what the formula is today? Nope. Nobody knows, because it's the biggest secret in the world ...

OK, the PR algorithm works great, but ...
Do you know that Google's formula uses over 100(!) correction factors (they call them "signals"), including PR, when calculating a rank for pages?

Try to recall any physical law you studied in school, e.g., Ohm's Law, I=V/R, and imagine that its formula additionally had 100+ correction factors! What kind of formula is that? What kind of law is that?

In the case of Google, it's OK.
I don't know what kind of input Google uses (who does?). But having such a big and uncertain number of correction variables, constantly and secretly changed by hand, greatly increases the chances that the search results are simply distorted in one direction or another.

Well, I'm usually very tolerant of people's beliefs and the ideas they put at the core of their business. But here we are talking about a very big business, which affects a very large number of other people's businesses. Google's PR algorithm defines the level of visibility of each(!) webpage. In fact, Google's algorithm controls access to the whole Web! In a way, it is as important for society as the fundamental Ohm's Law is for all related sciences.

What I'm trying to say is that while ANY formula may be objective by itself, entering some subjective input for its variables makes the whole thing subjective as well!
But this is true for ALL search engines on the market.

Now, back to the Mini-WWW's MR formula.
Its "Design" and "Content" ingredients aim to measure quality, and thus, are subjective by definition.

-- To Be Continued --


Copyright(C) Beloy ::: Mini-News