Hello doctor, I have a problem with karma
"90-60-90", or travel around the city by policeman
I know a fair number of services that use the concept of "karma", and all of them compute the rating with some vague algorithm. Making this index depend, even through empirical non-linear functions, on assorted phases of the moon and on the goodwill of other users leads to a noticeable number of problems. That is what I would like to speculate about here, along with a possible solution.
I will first outline all the problems I see, and then try to draw overall conclusions.
Problem 1. Weight multipliers
Karma votes cannot simply be added and subtracted, as is done, for example, on Habr. This is quite obvious: compare two users with the same karma (say, 50), Vasya and Petya. Vasya got there this quarter with one minus and fifty-one pluses (a liked ÷ disliked ratio of roughly 50 ÷ 1). Petya, the old-timer, has five hundred pluses and 450 minuses (that is, his supporters and opponents are split roughly equally). It seems clear to me that Vasya's karma ought to be higher than Petya's, despite his youth and recent registration.
But if you just bluntly normalize karma by the total number of votes, the result is not quite what we expected either: Vasya comes out about 18 times cooler than Petya (50/52 against 50/950), although nearly 20 times as many people voted on Petya and he still managed to stay in positive territory. Besides, what if there is a pack of three hundred dumb evil trolls who downvote everyone, and they got to Petya long ago but have not reached Vasya yet?
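The two naive schemes can be compared directly on the numbers from the Vasya and Petya example. This is only a minimal sketch: the function names `raw_karma` and `normalized_karma` are illustrative assumptions and do not describe how any real service computes karma.

```python
def raw_karma(pluses: int, minuses: int) -> int:
    """Naive karma: pluses minus minuses."""
    return pluses - minuses


def normalized_karma(pluses: int, minuses: int) -> float:
    """Blunt normalization: net score divided by the total number of votes."""
    total = pluses + minuses
    return (pluses - minuses) / total if total else 0.0


vasya = (51, 1)     # newcomer: 51 pluses, 1 minus
petya = (500, 450)  # old-timer: 500 pluses, 450 minuses

# Plain summation cannot tell the two users apart.
print(raw_karma(*vasya), raw_karma(*petya))

# Blunt normalization separates them, but heavily penalizes the user
# who simply collected many more votes.
print(normalized_karma(*vasya), normalized_karma(*petya))
```

Neither number captures the intuition on its own: the first ignores how the score was earned, the second ignores how many people it took to earn it.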
Problem 2. Objective and subjective indicators
Yes, there are purely objective indicators (98% of those who saw the announcement went on to read under the cut) and purely subjective ones (Vasya went to Petya's profile and downvoted his karma), but the vast majority lie somewhere in between. For example, the number of times something was bookmarked is, statistically, a more or less objective indicator, while the number of ± votes on an article stays closer to subjective even on a large sample of voters. In addition, when calculating subjective preferences it may help to account for each voter's tendency: if someone spends their entire allotted resource on minuses, their minuses should depreciate and their pluses, on the contrary, should gain value, and vice versa.
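One way to express the voting-tendency idea is sketched below. The weighting rule is an assumption made for illustration (the text only says that a habitual downvoter's minuses should be worth less and their pluses more): each kind of vote is weighted by the smoothed share of the opposite kind in the voter's history, scaled so that a balanced voter gets a weight of 1 for both. The name `vote_weights` is hypothetical.

```python
from __future__ import annotations


def vote_weights(pluses_cast: int, minuses_cast: int) -> tuple[float, float]:
    """Return (plus_weight, minus_weight) for a single voter.

    Assumed rule: the rarer a kind of vote is for this voter, the more
    it weighs. Add-one smoothing keeps brand-new voters at (1.0, 1.0).
    """
    total = pluses_cast + minuses_cast + 2  # +1 smoothing for each kind
    plus_weight = 2 * (minuses_cast + 1) / total
    minus_weight = 2 * (pluses_cast + 1) / total
    return plus_weight, minus_weight


balanced = vote_weights(10, 10)   # both weights equal
grumbler = vote_weights(0, 98)    # minuses nearly worthless, pluses valuable
print(balanced, grumbler)
```

The two weights always sum to 2, so a voter cannot gain total influence by changing habits, only shift it from one kind of vote to the other.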
Problem 3. Obvious and non-obvious indicators
An obvious indicator, in my view, is, for example, the number of times a note was bookmarked. This is an important parameter, but it cannot be used directly: some people bookmark every other note, others one a year. So it makes sense to use the parameter normalized by the user's tendency to bookmark, something like the inverse of the normalized bookmarking frequency. Then John, who has been on the service for a week and bookmarks for the first time, and Peter, who has been on the service for a year and is adding his fiftieth bookmark, will contribute the same amount to karma, while Kolya (a month on the service, 30 bookmarks, "Total Recall" syndrome) will contribute 7 times less.
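The arithmetic behind this example can be checked with a small sketch. It assumes the reference rate is one bookmark per week and that the weight of a bookmark is the inverse of the user's weekly bookmarking rate; both choices, and the name `bookmark_contribution`, are assumptions for illustration.

```python
def bookmark_contribution(bookmarks: int, days_on_service: int) -> float:
    """Weight of one bookmark: inverse of the user's bookmarks-per-week rate.

    `bookmarks` counts the current bookmark too, so it is at least 1.
    """
    if bookmarks < 1 or days_on_service < 1:
        raise ValueError("need at least one bookmark and one day on the service")
    per_week = bookmarks / (days_on_service / 7)
    return 1 / per_week


newcomer = bookmark_contribution(1, 7)     # one week, first bookmark
veteran = bookmark_contribution(50, 365)   # one year, fiftieth bookmark
hoarder = bookmark_contribution(30, 30)    # one month, thirty bookmarks

print(newcomer, veteran, hoarder)
print(newcomer / hoarder)  # how many times less the hoarder's bookmark weighs
```

With these inputs the first two users get nearly the same weight (both bookmark about once a week), and the third gets roughly one seventh of it.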
As a non-obvious indicator it makes sense to include, say, the "atmosphere" in the comments (the overall sum of pluses and minuses). For example, a note whose comments have no minuses (or a negligible number of them) is probably more beneficial than one that sparked a holy war. The total number of comments, on the other hand, cannot serve as a parameter at all, I think.
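A rough sketch of such an "atmosphere" score follows. Measuring it as the share of pluses among all comment votes is an assumption (the text only speaks of the overall sum of pluses and minuses), as is treating a thread with no votes as calm; `comment_atmosphere` is a hypothetical name.

```python
from __future__ import annotations


def comment_atmosphere(comment_votes: list[tuple[int, int]]) -> float:
    """Return a score in [0, 1]; 1.0 means the thread has no minuses.

    `comment_votes` holds one (pluses, minuses) pair per comment.
    Note that the number of comments itself does not affect the score.
    """
    pluses = sum(p for p, _ in comment_votes)
    minuses = sum(m for _, m in comment_votes)
    total = pluses + minuses
    return 1.0 if total == 0 else pluses / total


calm_thread = [(5, 0), (3, 0)]
holy_war = [(10, 10), (2, 8)]
print(comment_atmosphere(calm_thread), comment_atmosphere(holy_war))
```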
Problem 4. Karma vs. rating
There are two orthogonal figures: karma and rating. Karma is a function of subjective, non-obvious indicators; rating, on the contrary, is a function of objective, obvious ones. It seems obvious to me that the first is fit only for a size-measuring contest, whereas a user's "power" can be determined only from objective, obvious indicators.
Entities
I was able to come up with the following entities that can be evaluated:
- entry
- comment
- action (± and similar)
- general activity
All subjective assessments should be normalized both per user (the value of a vote is inversely proportional to the balance of such votes from that user) and by their total number.
Now let's try to bring it all together.
Entries and comments
Entries and comments can be rated and/or bookmarked; in addition, they can be linked to both from outside and from within Habr. The weight of a comment (and of the votes on it) is lower than that of an entry.
Actions and other activity
These two parameters can only be measured.
Summing up
To be honest, I have no silver bullet: the coefficients need empirical fitting. But I believe the formulas for the two parameters mentioned should look something like this:
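Karma = NORM_all ( ∑ NORM_this (±karma) )
Rating = NORM_all ( ∑ NORM_this (±topics) ) + ⅓ × NORM_all ( ∑ NORM_this (±comments) ) + F ( number of external links ) + ⅓ × F ( number of internal links ) + ∑ NORM_this (bookmarks)
Here NORM_this is the normalization by the habits of the user who cast the vote, and NORM_all is the normalization by the total number of votes.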
The weight of each term in this formula needs to be adjusted manually.
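To show how the terms of the two formulas above fit together, here is a hedged sketch. Everything the text leaves open is filled with a labeled assumption: the outer normalization is plain division by the number of votes (the simplest stand-in, which inherits the drawback noted under Problem 1), F is taken to be logarithmic damping, and the votes and bookmarks are assumed to arrive already weighted per voter. All function names are hypothetical.

```python
from __future__ import annotations

import math


def normalize_over_all(weighted_sum: float, vote_count: int) -> float:
    """Assumed outer normalization: divide by the total number of votes."""
    return weighted_sum / vote_count if vote_count else 0.0


def link_score(links: int) -> float:
    """Assumed form of F: logarithmic damping of the link count."""
    return math.log1p(links)


def karma(karma_votes: list[float]) -> float:
    """Karma from signed karma votes, each already weighted per voter."""
    return normalize_over_all(sum(karma_votes), len(karma_votes))


def rating(
    topic_votes: list[float],
    comment_votes: list[float],
    external_links: int,
    internal_links: int,
    bookmark_weights: list[float],
) -> float:
    """Rating as the sum of the five terms, with the 1/3 factors from the text."""
    return (
        normalize_over_all(sum(topic_votes), len(topic_votes))
        + normalize_over_all(sum(comment_votes), len(comment_votes)) / 3
        + link_score(external_links)
        + link_score(internal_links) / 3
        + sum(bookmark_weights)
    )


print(karma([1.0, 1.0, -0.5]))
print(rating([1.0, 0.8], [1.0, -1.0], 3, 1, [1.0, 0.14]))
```

The per-term weights are left at the values given in the formulas; in practice they would be the knobs to tune by hand.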
Constructive and destructive criticism is very welcome. I am sure I have missed something important, and I hope to improve the methodology with your help. Unfortunately, I have no way to go through all the posts on Habr and count the numbers by hand (whereas with direct access to the database it would take a couple of queries). If we could persuade the administration (no, not to change the existing system, just to calculate these two values for all users), that would be cool. Well, I think so.