ok, the image and css downloader is working, and visiting future archival download pages should be lighter-weight on the server than visiting any of its regular pages. I have opened my latest downloaded set, which would have hit the server with over 1k image downloads, and it should have barely registered on the radar
(unlike my sequential download of images for the earlier set; that's probably not going to be of much use, as the avatars all seem to have expired by now. oh well... I guess I should have downloaded them back then...)
nice, but then I wonder what it is that's blocking based on User-Agent... it's pretty clear that this is what's happening: it's entirely reproducible, and the error message is different from the one I get when an IP address is blocked. the TLS session is actually established, then abruptly aborted, whereas IP blocking prevents it from being established at all
now, I haven't been able to make even *regular* accesses for weeks because of this setting, and the accesses you're seeing in your logs are either normal browser-based access while I tried to figure out what was going on, or *sequential* page preloads from my scripts for offline reading and archiving, neither of which has ever gotten an IP blocked. but access with a modified User-Agent... that amounts to an instant block on the very first connection, and only when accessing gnusocial.net.
now, maybe it's not about GNU in the User-Agent; maybe it's libresoc? or something else?
Here's what I was using before, that stopped working a few weeks ago: "Mozilla/5.0 (X11; GNU libresoc64; rv:94.0) Gecko/20100101 Firefox/94.0"
while what works now has "Linux x86_64" instead of "GNU libresoc64", plus the actual, current version numbers.
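if my diagnosis holds, the suspected filtering should be reproducible with a minimal probe like the sketch below. the host and the two User-Agent strings come from the posts above; the probe itself is just an illustration, not the script actually used, and the diagnostic comment reflects my reading of the symptoms.

```python
import urllib.error
import urllib.request

AGENTS = [
    # the override that stopped working:
    "Mozilla/5.0 (X11; GNU libresoc64; rv:94.0) Gecko/20100101 Firefox/94.0",
    # the stock string that still works (version numbers will vary):
    "Mozilla/5.0 (X11; Linux x86_64; rv:94.0) Gecko/20100101 Firefox/94.0",
]

def probe(url, user_agent, timeout=10):
    """Return the HTTP status code, or the exception raised by the attempt."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except (urllib.error.URLError, OSError) as exc:
        return exc

# usage (hits the network, so not run here):
#   for ua in AGENTS:
#       print(ua, "->", probe("https://gnusocial.net/", ua))
# a connection reset *after* the TLS handshake points at UA filtering;
# an IP block would keep the handshake from completing at all.
```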
@administrator, did you by any chance start blocking https requests based on User-Agent a few weeks ago, or tightened preexisting rules?
I had been unable to access it with abrowser for all this time, and could only access it with text-based browsers over tor after my home IP got blocked. at first I didn't realize I was getting an unusual error; then I thought some abrowser upgrade had broken something; then I upgraded another machine that still worked, and it kept on working; then I verified all TLS- and abrowser-related files on this machine and found nothing unusual; finally, I realized that both browser profiles in which it failed had a modified User-Agent (with GNU in the operating system name, rather than the misnomer Linux), while the one in which it worked didn't. once I disabled the User-Agent override, it started working again. so now I've configured the browser to lie about the operating system name when contacting gnusocial.net, and I'm back!
if my diagnosis is correct, could your User-Agent blocking rules please tolerate GNU as the operating system name? TIA,
*nod*
I don't oppose blocks in general
they can be a useful tool to protect vulnerable populations
I oppose blocking justified by lies, rumors, or inflated accusations, such as allegations that an instance had to be blocked because a user from another instance sent the instance operator a screencap of a fediblock query about that instance on a web tool allegedly connected to kiwifarms. laughable, I know, but I'm aware of two instances that have gotten widely blocked over such nonsense.
I trust our instance hasn't fallen for this nonsense, nor been strongarmed into adopting such nefarious blocking.
#CW #shitposting
no filtering CWs, no way!
if anyone so much as mentions CWs, I'll call in the fediblockers to defederate everybody!
because you know who else used CWs?
kiwifarms!
it's true, this note!
(used? I don't even know. I'm just making up jokes to mess with the fediblockers' heads :-)
as zé simão would say, hahaha
erhm... I'd love to understand what you mean by that. there are two sources of uncertainty. (i) what I do has changed very significantly over the past week: from (a) using the infinite-scrolling web interface, which downloaded an incremental update every 16 posts/threads, adding up to some 50 pages per day (one of those fetches would fail 2 or 3 times a day, forcing me to start over and fetch several updates again, or to binary-search for the point where I was, wasting my time and server resources; I'd occasionally load 21 or 42 pages at once after such failures to get back on track; was that ever a problem?); to the failed experiment (b) of batch-downloading thousands of posts and merging them into a single web page with a week's worth of posts, which DoSed my browser and the server with requests for avatars and whatnot; and finally to (c) breaking the downloaded posts up into groups of a few hundred, some 20 paginated views every 8h.
with the current arrangement, I batch the page downloads, and the avatar downloads happen when displaying them. that might hit the server harder for a short time, but it should be more efficient: the avatars and the pages would be downloaded one way or another, and now I'm actually saving server resources by reducing redundant downloads, and possibly even improving cache use at multiple layers by increasing locality and freshness, as batch processing usually does
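the dedup step described above could be sketched roughly like this; the regex, the directory layout, and the file naming are assumptions about the downloaded pages, not the actual scripts:

```python
import hashlib
import pathlib
import re
import urllib.request

# assumed shape of avatar links in the saved pages; real markup may differ
AVATAR_RE = re.compile(r'src="(https://[^"]+/avatar/[^"]+)"')

def localize_avatars(pages_dir="pages", cache_dir="avatars"):
    """Fetch each distinct avatar URL once and point all pages at the copy."""
    cache = pathlib.Path(cache_dir)
    cache.mkdir(exist_ok=True)
    fetched = {}  # remote URL -> local file name
    for page in pathlib.Path(pages_dir).glob("*.html"):
        html = page.read_text()
        for url in set(AVATAR_RE.findall(html)):
            if url not in fetched:
                # content-addressed-ish name, so reruns reuse the cache
                name = hashlib.sha1(url.encode()).hexdigest() + ".img"
                urllib.request.urlretrieve(url, cache / name)
                fetched[url] = name
            html = html.replace(url, f"{cache_dir}/{fetched[url]}")
        page.write_text(html)
    return fetched
```

the point of the `fetched` map is exactly the saving claimed above: an avatar that appears on a hundred pages is downloaded once, not a hundred times.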
sorry, my message had an error in brewster's name, but when I got the error I was trying to subscribe with the button on the local page of his profile, so I couldn't even have made the same mistake: https://gnusocial.net/user/270574 https://mastodon.archive.org/@brewsterkahle
I just tried to subscribe once more, and it still gives me 'unknown error'
hi @administrator
gnusocial.net gives me errors when I try to subscribe to:
brewstarkahle@mastodon.archive.org (mastodon, "unknown error" on confirm)
kfogel@rants.org (wordpress activitypub, "could not reach")
do you happen to know whether we have some local incompatibility, or whether I need to contact the remote users or operators? I'd like to report this to the maintainers, but I can't figure out which project to contact :-/ maybe the gnusocial.net logs would be informative? thanks a lot,
oops. are you still seeing that? I ran some trials over the weekend that may have pushed things over the limit, but I've settled on something that I think is more in line with common use
I was more worried about the sequential downloading of pages, because each one takes some 10 seconds of server processing and then a while longer to download, but AFAICT the problem of too many requests came up when I concatenated a few hundred pages (about a week's worth of posts) into a single file and then attempted to open it in the browser. tons of requests for pictures and whatnot all at once. that won't happen again. sorry.
thanks for the suggestion; infinite scrolling is really only part of the issue, and I do use explicit paging to overcome some of the annoyances, but it's not enough
some of the problems are arguably bugs in GNU social:
- if fetching the next page fails, the page dies and won't retry the fetching
- if you leave the page alone for a while, next posts appear out of order
- if you reload, you don't get back to where you were, and the longer you wait, the farther away from it you get
- there's no marker of how far I got last time, and threads bubble up to lower-numbered pages as new replies arrive
- finding where I was after a failed fetch, a reload or a restart takes a long time, since each page takes a while to fetch, and that's unsuitable for automation
local 8h files, no waiting, working reloads => happy camper :-)
Alexandre Oliva (lxo@gnusocial.net)'s status on Monday, 28-Nov-2022 18:29:55 JST
Alexandre Oliva/me is fighting infinite scrolling, high latency and various undesirable consequences thereof by writing scripts to download and combine GNU social's paginated timeline views into local static html. it's been very satisfying, a lot less stressful after wrong clicks, reboots, browser restarts, page reloads/unloads, and for offline reading.
I think I'm going to extend and use them to archive my timeline locally, the same way I archive email. I hope I'm not hammering the server too hard.
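the kind of script described above might look roughly like the sketch below. the `?page=` URL pattern matches GNU social's paginated timeline views as I understand them, but treat it (and the crude concatenation) as assumptions; the real scripts likely do more cleanup.

```python
import urllib.request

def page_url(base, page):
    """Assumed URL pattern for GNU social's paginated timeline views."""
    return f"{base}?page={page}"

def fetch_page(base, page, user_agent="Mozilla/5.0"):
    req = urllib.request.Request(page_url(base, page),
                                 headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8", "replace")

def archive(base, first, last, out="timeline.html"):
    """Concatenate a range of timeline pages into one static local file."""
    # once written, reloads, restarts and wrong clicks cost nothing
    parts = ["<html><body>"]
    for page in range(first, last + 1):
        parts.append(f"<!-- page {page} -->")
        parts.append(fetch_page(base, page))
    parts.append("</body></html>")
    with open(out, "w") as f:
        f.write("\n".join(parts))

# usage (hits the network, so not run here):
#   archive("https://gnusocial.net/main/all", 1, 20)
```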
infinite scrolling is quite an anti-pattern we've borrowed from surveillance capitalism's attention vampires :-( I'm pretty sure there's an xkcd about this
eventually we get used to opening stuff in other tabs and carrying on, but I often find myself wishing for better ways to navigate the timeline that aren't so vulnerable to a wrong click or a reboot
did you not watch hotmail and gmail happen? and gmail's XMPP? quite different, for sure, but still some reference points worth worrying about. centralization and incompatible changes always lead to trouble, and mastodon's early decision to abandon ostatus federation unilaterally was not a good omen
erhm... I'm not sure what problem you're trying to solve here, and I don't know how other pieces of the puzzle fit in.
one issue is separating CW text from the content proper, and displaying only the CW text unless the content proper is requested. this would require understanding the conventions that other parts of the fediverse use to transfer these separate pieces of text
the other issue is enabling our local users to post messages with separate CW and content, and having them shared with other nodes according to established conventions, so that their nodes or apps display the CWs to them, and only display the content upon request.
ISTM that having the local server identify a CW hashtag and refrain from displaying attached images would not address either issue; I'm not seeing any messages tagged with CW that would be affected by this change.
well... many of the topics people use CW for wouldn't fit the "not safe for work" label, whereas e.g. unionizing is probably not the first topic that would come to mind when people familiar with NSFW see that tag ;-)
thanks. NSFW doesn't encompass the intended uses of CW, though, and we don't get to fill in content warnings, so I think it would be more misleading than useful