Empty group and unexpected results from real pol.is data example #11
Comments
Not sure why Group 1 is empty. The "consensus statements" and "divisive statements" seem to contradict the group characteristics. For example, statement 26, "I want my brigade to serve as a partner & champion for traditionally marginalized groups who are otherwise underserved by local government.", is strongly agreed with by both Groups 2 and 3 but shows up in the divisive statements.
Hi @nicobao, great idea to verify it against actual Pol.is data. I didn't know an easy way to do that, but this is excellent.
The vote encoding might need adjustment, especially the "pass" values, which could skew the results. My code is probably not accounting for the large number of "pass" votes. Now that you have actual pol.is data for comparison, you might also want to check it against the code that Maanas created, which is arguably closer to the algorithm that pol.is itself uses: https://github.com/MaanasArora/polis-ctto/blob/master/polis.server/polis/core/routines.py (it's probably not very hard to substitute the algorithm for testing it against your sample data).
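To illustrate why the handling of "pass" votes can matter, here is a minimal sketch. It assumes votes are encoded as 1 (agree), -1 (disagree), 0 (pass) and `None` (statement not seen); that encoding and the `mean_vote` helper are hypothetical and not taken from polislite or polis-ctto. It compares a statement's mean vote when passes count as zeros against the mean when passes are left out.

```python
from typing import Optional, Sequence

# Assumed encoding for illustration only; not confirmed by either repo.
AGREE, DISAGREE, PASS = 1, -1, 0


def mean_vote(votes: Sequence[Optional[int]], count_pass: bool) -> Optional[float]:
    """Mean of one statement's votes.

    None entries (statement never shown) are always skipped.
    PASS entries are kept only when count_pass is True.
    """
    kept = [v for v in votes if v is not None and (count_pass or v != PASS)]
    if not kept:
        return None
    return sum(kept) / len(kept)


# A statement with many passes: 2 agree, 1 disagree, 4 pass, 1 unseen.
votes = [AGREE, AGREE, PASS, PASS, PASS, PASS, None, DISAGREE]
with_pass = mean_vote(votes, count_pass=True)      # passes pull the mean toward 0
without_pass = mean_vote(votes, count_pass=False)  # mean over agree/disagree only
print(with_pass, without_pass)
```

With many passes, the first mean is pulled toward zero, so a statement can look much less agreed-upon than it is among the people who actually took a side.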
I would love to experiment with polis-ctto, but there is no open-source license :/
That shouldn't prevent you from testing it in the first place, no need for an open-source license if it doesn't produce the results you want 😉
Btw, I tried adding Patcon's suggested fix, but it doesn't change the result! (I had actually already merged the add_test branch to main; that's what I tested the data against.)
I hadn't tried Patcon's suggested fixes because I started my new job recently. I would have to check again during the weekend how to handle 'pass' votes.
No problem. I don't have any expectations whatsoever :) I hope you love your new job!! polis-ctto has a license now. That was fast! I'm trying this now! |
After careful testing of both this repo and polis-ctto, I came to the conclusion that the core Pol.is algorithm has many hidden details, built up over years of testing, that make the clustering particularly good. For example, they clean the data to prepare for PCA: It seems that all these details are critical to end up with relevant clusters at scale.
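The specific cleaning steps are not listed in this comment. As a rough idea of what preparing a vote matrix for PCA can involve, here is a minimal sketch under stated assumptions: participants with too few votes are dropped, missing votes are filled with the statement's mean, and each statement column is centered. The `clean_votes` helper and the `min_votes` threshold are hypothetical; this is not the Pol.is implementation.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]  # rows = participants, columns = statements


def clean_votes(matrix: Matrix, min_votes: int = 2) -> List[List[float]]:
    """Hypothetical pre-PCA cleaning: filter sparse rows, impute, center."""
    # 1. Drop participants who cast fewer than min_votes votes.
    kept = [row for row in matrix if sum(v is not None for v in row) >= min_votes]
    if not kept:
        return []
    n_cols = len(kept[0])

    # 2. Per-statement mean over the observed votes (0.0 if nobody voted).
    means = []
    for j in range(n_cols):
        seen = [row[j] for row in kept if row[j] is not None]
        means.append(sum(seen) / len(seen) if seen else 0.0)

    # 3. Fill missing votes with the statement mean, then center each column.
    #    Imputed cells therefore become exactly 0 after centering.
    return [
        [(means[j] if row[j] is None else row[j]) - means[j] for j in range(n_cols)]
        for row in kept
    ]


raw = [
    [1, -1, None],
    [1, None, -1],
    [None, None, 1],  # only one vote: dropped by the min_votes filter
    [-1, 1, 1],
]
cleaned = clean_votes(raw)
for row in cleaned:
    print(row)
```

Small choices here (the threshold, mean imputation versus zero-filling) change which participants dominate the principal components, which is one plausible reason two implementations can produce different clusters from the same data.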
Here is my attempt to combine your repo with polis-ctto (K-MEANS): I tested multiple combinations. https://github.com/zkorum/polislite/blob/with-kmeans/polis_core.py#L35-L36
Hi Erik,
I tried running polislite with a real pol.is conversation:
I converted the two .csv files to your expected .yaml format:
report_from_polis.yaml
Repro available here: https://github.com/nicobao/polislite-repro
As you can see in result.log, it's strange:
Any idea what went south?