Note: This post is now archived, and as such the image no longer works.
This is possible because Lemmy doesn’t proxy external images but instead loads them directly. While not all that bad, this could be used for spy pixels by nefarious posters and commenters.
Note that the only thing I willingly log is the “hit count” visible in the image, and I have no intention to misuse the data.
Nice example!
I think proxying everything through Lemmy would have a pretty big bandwidth/scalability impact. I expect the Lemmy clients don’t send any unique user info on these image requests, so I’m not sure how useful it would be as a spy pixel? Maybe I’m missing something :-)
It would be interesting to see just how much info is shared when Lemmy requests the image. If there is [potentially] sensitive info being shared, the devs might be interested in working on it too (I have no idea how to check such a thing, this comment is just so I can find the post later when more people have shared their wisdom on it)
Notably, this allows remote parties to associate your IP address with your interests, as revealed by the Lemmy communities that you browse.
One way is for the image host to use the HTTP Referer field. (Standards-respecting web browsers pass the URL of the web page being viewed to the server hosting the image.)
Another way is by posting an image with a unique URL.
Even if Referer is withheld and the image is not unique, the image host can still do basic fingerprinting of your client’s request header and your OS’s TCP quirks, and associate that fingerprint with your IP address.
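To make the three points above concrete, here is a minimal sketch of what an image host could record from a single image request. It is stdlib-only Python, and everything named here is an assumption for illustration: the `summarize_request` helper, the `/pixel/<token>.png` URL layout, and the choice of headers to hash. TCP-level fingerprinting is left out.

```python
import hashlib
import re

# Hypothetical URL layout: each recipient is sent their own /pixel/<token>.png
TOKEN_PATTERN = re.compile(r"^/pixel/([A-Za-z0-9]+)\.png$")


def summarize_request(client_ip, path, headers):
    """Collect what one image request reveals to the image host."""
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # A unique URL identifies the recipient even without any headers.
    match = TOKEN_PATTERN.match(path)

    # Crude header fingerprint: hash a few headers that differ between clients.
    fingerprint_source = "|".join(
        lowered.get(name, "")
        for name in ("user-agent", "accept", "accept-language", "accept-encoding")
    )
    fingerprint = hashlib.sha256(fingerprint_source.encode("utf-8")).hexdigest()[:16]

    return {
        "ip": client_ip,
        "referer": lowered.get("referer"),  # None if the client withholds it
        "token": match.group(1) if match else None,
        "fingerprint": fingerprint,
    }
```

Calling `summarize_request("203.0.113.7", "/pixel/abc123.png", {"Referer": "https://lemmy.ml/post/1"})` would tie that IP address to both the page being viewed and the token `abc123`.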
An option for Lemmy to proxy media would be very helpful. Small instances could perhaps disable it, although they might not need to, since the additional load would scale with the number of users on that instance.
Were you expecting otherwise? Loading an external image is no different than loading an external website with images. Lemmy and reddit are link aggregators, not proxies. Having to proxy everything would incur significant bandwidth costs for instance admins, who are often paying out of pocket for hosting.
How do you get an image to run code? I guess I somehow missed something important in website development.
Edit: I saw that you said you’re using Pillow to actually render the image from code. That’s neat! …and scary
deleted by creator
Share source code? I’m curious
It’s just a simple Flask server. I parse the user-agent using the user_agents Python library, apply some conditionals upon the result, render the image using Pillow and send it to the user.
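For anyone curious what the “conditionals” step might look like, here is a rough stdlib-only sketch. It is not the actual server code: plain substring checks stand in for the user_agents library, the `describe_client` name is made up, and the Flask route and Pillow rendering are left out. Only the two caption strings are taken from what people report seeing in this thread.

```python
UNKNOWN_CAPTION = "You are viewing this from an unknown (mobile?) client"


def describe_client(user_agent):
    """Return the caption text for a given User-Agent header value."""
    ua = (user_agent or "").lower()

    # Order matters: Android user agents usually also contain "Linux",
    # and iPhone/iPad user agents contain "like Mac OS X".
    if "android" in ua:
        platform = "Android"
    elif "iphone" in ua or "ipad" in ua:
        platform = "iOS"
    elif "windows" in ua:
        platform = "Windows"
    elif "mac os x" in ua:
        platform = "macOS"
    elif "linux" in ua:
        platform = "Linux"
    else:
        # Missing or unrecognized header: fall back to the generic caption.
        return UNKNOWN_CAPTION

    return f"You are viewing this from {platform}"
```

The returned string is what would then be drawn onto the image before it is sent back, which is why every viewer can see a different picture at the same URL.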
Oh neat, Jerboa doesn’t identify itself. Cool.
I get “unknown (mobile?) client” using Jerboa
Same on Sync (You are viewing this from an unknown (mobile?) client)
And on Infinity (You are viewing this from Android)
I’m fine with this. Instances shouldn’t proxy or cache images because it opens instance owners to a lot more liability than text. A client side setting to not load images in comments by default is better.
What is it supposed to say?
“You are viewing this from The Black Pearl, Davy Jones.”
The easiest way to stop this from happening is to use uBlock Origin to block all third-party requests on your instance.
One way to do this is via dynamic filtering. This is for advanced users so be sure to read the info page: https://github.com/gorhill/uBlock/wiki/Dynamic-filtering
(Consider backing up your uBlock Origin settings before doing this)
If you are using lemmy.ml your rule would be this:
lemmy.ml * 3p block
If you’re using another instance, change the domain, or use both rules because you might end up visiting the others as well. Note that adding this rule won’t work unless you enable advanced features in uBlock Origin.
EDIT: THIS MIGHT BREAK THINGS ON YOUR INSTANCE. It’s recommended to learn how to use dynamic filtering to unbreak it: https://github.com/gorhill/uBlock/wiki/Dynamic-filtering:-quick-guide If it breaks stuff, just remove that rule.
You could also block it using static filters, but I can’t remember exactly how to do that. If you know, please reply below.
Yeah, I’m using Mullvad with misc DNS blockers enabled so it has nothing on me ᕕ( ᐛ )ᕗ
VPN using Librewolf user checking in. This post got nothing on me.
Man, I remember I scared the crap out of trolls on Reddit when we started arguing over DM, and I added a link to a meme that tracked their IP and system info (without them knowing ofc). Let’s just say they went AFK quickly after that. Good times!
unknown device?
The user-agent detection definitely isn’t great. If it doesn’t recognize a client, it just says unknown. But that wasn’t the main point of the post anyway, this was just meant as a quick proof of concept for anyone curious.
What’s the point of unknown?
unkown
Oh, how did I not notice that before? It should be fixed now.
Still says unkown for me.
Location is right, but I highly doubt anyone near me is using Lemmy (dictatorship here).
deleted by creator
Thought about adding the user’s location, but was worried PythonAnywhere could somehow cache the image between multiple people. A great demo though!
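On the caching worry: one common way to ask clients and intermediaries not to reuse a per-viewer image is to send cache-control headers with the response. A minimal sketch, assuming a made-up helper name; whether PythonAnywhere or any particular Lemmy app actually honors these headers is not something this thread confirms.

```python
def no_cache_headers():
    """Response headers asking clients and proxies not to store the image."""
    return {
        # no-store: do not keep a copy; private: shared caches must not reuse it.
        "Cache-Control": "no-store, private, max-age=0",
        # Legacy equivalents for older HTTP/1.0 caches.
        "Pragma": "no-cache",
        "Expires": "0",
    }
```

These would be merged into the headers of each image response, so that two different viewers are less likely to be served the same rendered picture.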
Thanks for the heads-up.
Routing my Lemmy mobile app through orbot from now on. Seems to have fixed the issue.
I hate this so much. It’s super cool but MAN what the hell. I don’t think I’m going to ever turn off my VPN anymore. I’m in a super small town and that image is correct.
It’s cached somewhere because I can’t get it to update. Maybe time for a new account too. Hmmmm
Yeah, app cache had to be cleared. We good
Finally. Someone noticed 🥹
You have the code for this? Very interested in how you implemented it
deleted by creator
Damn, PHP is such a sleeper of a language, I always forget how useful it can be. Thanks for sharing!
I’m not using a VPN or anything and it got my location wrong by 700 kilometers 🤔
Are you sure you are where you think you are? When’s the last time you looked outside?
Hey. I wanted to do this tomorrow.
Well I have a new idea which is pretty similar
I’m planning to make one of these “dox’d memes” where someone says something controversial and another one answers with the IP address.
Woah this is really cool. Though it was way off for me and I’m not on a VPN right now.
My location is accurate, to give some good feedback on your program too lol
Joke’s on you. IP geolocation where I am is an unreliable mess and your image got it wrong by about 1000km!
You can run Geolocation with images now? What the heck? How?
All these people correcting the result are effectively giving useful data to improve data collection and detection methods.
So what is happening if I don’t see an image?
It is because the website providing the image is overloaded and cannot create an image.
You just have to reload the image and eventually you will see one.
Lemmy clients should really include an option to group links, or to show only the first instance of a link, for cases like this where the same link is posted to multiple places.
- Mlem - knows exactly that it’s Mlem.
- Memmy - sees Mobile Safari webkit.
- Voyager - same as Memmy.
- Thunder - just sees Mobile Client.
- Jerboa - also just sees a Mobile Client
Voyager on Android
Doesn’t know it’s Sync.