It would be nice to have a search engine made with decentralization in mind.
Like the internet is decentralized but the good search engines are at centralised companies like Google etc.
Fediverse is like e-mail
Therefore it needs a search engine.
What.
Or exposure to harassment, including offline. Or context collapse. Or…
In the end, adding search would change the space dramatically, especially any privacy-related expectations. And there are about 2 million people who are using fedi with the current set of expectations. There are hundreds of thousands who have been using it with this set of expectations for years. Waltzing in and bulldozing these expectations is just not a good idea.
So yeah, don’t do search on fedi unless you do some deep research about consent.
No thanks - that would harm the fediverse by allowing a lot of targeted trolling.
@anders There would need to be a way for that search engine to collect data that is both possible to contribute to as an individual and doesn’t unintentionally DDoS the sites it indexes, and that’s the challenge I think. Spiders collect a LOT of data. Right now the closest thing we have to decentralized search is metasearch engines like Searx, which query and cache results from all the major search providers that run their own spiders.
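To make the "doesn't unintentionally DDoS" part a bit more concrete, here is a rough sketch of per-host politeness, the kind of bookkeeping a volunteer-run spider would need. Everything in it (the `PoliteScheduler` name, the 10-second default delay) is a hypothetical illustration and not taken from any real crawler. It does no network access; it only tracks when each host may be contacted again.

```python
from urllib.parse import urlparse


class PoliteScheduler:
    """Hypothetical helper: remembers when each host may be fetched again."""

    def __init__(self, min_delay_seconds=10.0):
        # Assumed default; a real crawler would tune this per site.
        self.min_delay = min_delay_seconds
        self.next_allowed = {}  # host -> earliest allowed fetch time (seconds)

    def wait_time(self, url, now):
        """Return how many seconds to wait before fetching `url`."""
        host = urlparse(url).netloc
        return max(0.0, self.next_allowed.get(host, 0.0) - now)

    def record_fetch(self, url, now):
        """Note that `url` was fetched at time `now`."""
        host = urlparse(url).netloc
        self.next_allowed[host] = now + self.min_delay
```

A contributor's client would call `wait_time` before every request and `record_fetch` after it, so that many pages on one site are spread out over time while other hosts stay unaffected. Coordinating that state across many independent contributors is the hard part, and this sketch does not address it.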
The bigger issue is consent. People on fediverse feel very strongly about consent, and search engines tend to just ignore it. Better do some serious research into consent to search on fedi before embarking on designing a search engine for fedi.
@rysiek I don’t think Anders is asking about a search engine for the fediverse, this sounds more like a federated or P2P Google/DuckDuckGo replacement.
Ah I might have misunderstood, sorry.
@rysiek (though if you’d like to argue that search/spidering requires opt-in consent in all cases, I’m happy to hear that argument)
I don’t have to defend my right to decide how stuff I put out there can be used. Whoever wants to scrape my toots has to explain why they want to do so, and get my consent first.
And “well it’s publicly available so it’s fair game” is not enough of an argument. Just as “she was wearing a short skirt” is not consent to sexual advances.
It physically hurts to know that consent is such a controversial topic in tech circles, and it breaks my heart to hear people argue we give consent to invasive data practices just by existing on the internet. I’ve spent my entire life being taught by technology educators that I should expect everything I post online to be publicly accessible forever, and nobody ever stopped to ask why.
I am one of those technology educators, and today I would still warn people that “Internet does not forget”, and that they need to be careful what they put out there.
That doesn’t mean we should not demand an explanation from people who make it so, or that we should not demand that they ask for consent and respect our refusal to give it. I really appreciate how fedi culturally puts this front-and-center. I hope it continues to do so, and that this way of thinking spreads farther!
I agree that consent should not be a controversial topic. Regardless of how much it inconveniences techbros trying to “disrupt” yet another area of human endeavor.
@rysiek I was not talking precisely about scraping toots; I was asking whether you consider Google’s, Bing’s, etc. use of opt-out web spiders to be unethical, but fair enough. (Also, not interested in defending OP given the clarification that he is talking about searching the fediverse.)
I think search engines indexing plain old websites (blogs etc) are an importantly different case.
The nature of the medium in blogs/news websites/etc is way more public and way less intimate (in general…) than social media. Social media blur the line between private and public conversations, for better or worse.
Social media is like having a conversation in a public cafe; websites/blogs are more like publishing a newspaper or standing on the corner of a street shouting your message at strangers.
Making a public archive of newspapers or recording a person shouting at strangers is one thing. Recording semi-private conversations in a cafe is a whole different thing. Does that make sense?
@rysiek yeah, that’s the sort of distinction I was looking for. thanks!
@rysiek @f00fc7c8 The benefit model is different between the two too.
A blog’s author benefits (in some small way) from being indexed, because it helps drive traffic to the content they’re publishing for others to read.
The same cannot really be said of toots: it’s relatively rare that being indexed leads to positive contributions to an ongoing conversation.
I had a thought about a week ago about search engines. I think it may be better to move towards a curated-list style of search, instead of an automated indexing one. Think Pinterest, but with a greater emphasis on websites, people, and communities instead of just images.
That way search can be made more human. You can get to know and trust the individuals who curate content, and maybe donate to them if their efforts are valuable to you.
Something like this exists for PeerTube already. I didn’t look at their code, but one might be able to fork it to work with other platforms.
@XpeeN
Yeah I saw that. Great tool.
Is it really that good? I think I tried it some years ago and it wasn’t really a satisfying experience.