Animals that sadly don't exist

By pim on Saturday 26 July 2014 21:57 - Comments (23)
Category: -, Views: 7.716


Rabobank promotes clicking strange links in emails

By pim on Monday 24 March 2014 18:53 - Comments (18)
Category: -, Views: 5.699

I just received the following email from Rabobank:
Safe banking

Phishing, skimming, [..] Do you know what all of it means? These terms all have to do with digital crime.


A few tips for handling your banking safely online
  • Always go to Rabobank.nl in your usual, trusted way, and never by clicking a link in an email.

Do you recognize internet crime? And do you know how to arm yourself against it? At veiligbankieren.nl you can learn all about it.

So this email about security tells me not to click strange links in an email to reach Rabobank, and then hands me a strange link to a website affiliated with Rabobank. Suspicious; why isn't this information simply on rabobank.nl/veilig-bankieren/ ?

I click the link, only to be greeted by this page :X


This looks like the landing page of a spam email where someone tries to sell you a book on getting rich quick or losing weight.
It looks so amateurish, and now Rabobank doesn't even want me to read tips, but to download a book about security. I suspect these claims are fake:
"5 stars", "witty and educational", "bestseller" |:( They are deliberately trying to look like a phishing website as a joke; what kind of strange humor is that? I have no time for humor; I have other websites for that.
The author doesn't look like a Rabobank employee, but like someone from a painting company?

I follow the advice of veiligbankieren.nl itself: "KLIK WEG" (click away).

Thanks, Apple, for the unsolicited spam

By pim on Wednesday 03 July 2013 12:51 - Comments (12)
Category: -, Views: 7.280

Just a quick complaint, since I'm good at that.

Two weeks ago, when I bought an iPhone, they asked in the Apple Store whether I wanted to receive my receipt by email.
Fine; I gave my email address for that purpose. And now I'm starting to receive spam from Apple, sent from News_Europe@insideapple.apple.com.

Yes, there's an unsubscribe link in it, but in case an Apple employee reads this: these are the rules in the Netherlands for sending commercial emails:

- prior consent: the recipient must have given permission

Edit: Apple is allowed to send me this spam, because I'm a customer:
"May I send messages to my customers without their consent?" -

Existing customers (to whom a product or service has been sold) may be sent unsolicited messages about similar products/services without prior consent having been obtained. When the email address or mobile phone number is collected, however, the customer must be given the opportunity to refuse such unsolicited messages.


GoDaddy, F*ck you!

By pim on Monday 15 April 2013 23:10 - Comments (38)
Category: -, Views: 10.762

Yesterday I signed up with GoDaddy to register a domain name.
After I paid and registered the domain, I configured everything, and 2 hours later my website (which I had built earlier) was live.
Today I got an email saying that my account is blocked (?!) and that they will cancel it unless I send them my ID and provide a brief explanation of how I intend to use my domain. Right now my account is BLOCKED.

WTF?! It's none of your f*cking business what I'm going to do with my domains. If I'm violating a policy, email me. But asking me beforehand what my plans with a domain name are?
If this were hosting, for example, I wouldn't care; I'd just switch to another provider. But a domain name is unique, and therefore irreplaceable. And they're threatening to cancel an account while MY domain is already in it?

Even if my domain name were "download-illegal-movies.com", I'd take responsibility for that myself; I don't need a registrar to be the judge.
And if you find my registration suspicious for whatever reason, tell me during registration, instead of after you've received my money.

My brief explanation, by the way, was that it's none of their business. (If my domain were something weird or suspicious, I'd be happy to give them an explanation.)
I wonder if they'll cancel my account. I might even find it funny: GoDaddy canceling an account because I won't tell them how I'm going to use it.
Thank you for being our valued customer. Your account has been selected by our
verification office as a precautionary measure to defend you from possible misuse
of either your payment method or products in shopper account *************.

During the login process, our secure site will prompt you to upload a viewable,
scanned copy of the payment method account holder's government-issued photo
identification, such as a driver's license or passport. In the comments box,
we ask that you also provide a brief explanation of how you intend to use the
product(s) purchased. If we do not receive the requested documentation within
the next 48 hours, your order(s) may be cancelled for your protection.

Verification Office

[Wiktionary] Importing wiki XML dumps into MySQL

By pim on Monday 25 March 2013 08:36 - Comments are closed
Category: overig, Views: 4.262

I was trying to import an XML dump of Wiktionary into MySQL. It all worked well until I tried some non-Latin languages like Arabic and Chinese: the character encoding got messed up.

Just a reminder for myself, and maybe useful to other people struggling to import Wikipedia XML dumps as UTF-8 into MySQL. These are the steps I took on a WAMP (Windows, Apache, MySQL, PHP) setup.

For example, to import the Dutch Wiktionary:
1. Create a database; in my case I created `wiktionarynl` (nl = the language code for Dutch)
2. Create these tables: http://svn.wikimedia.org/...evision=87357&view=co

Run queries like the following to convert the text columns to utf8_bin (or to make sure they already are):
ALTER TABLE `page` CHANGE `page_title` `page_title` VARCHAR( 255 ) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;

3. Drop the unique page_title index from the `page` table (otherwise you can get 'duplicate entry' errors when importing the data).
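In SQL this step could look like the following; note that the index name `name_title` is an assumption based on the standard MediaWiki schema, so check the output of SHOW INDEX first if yours differs:

```sql
-- List the indexes on `page` to find the unique one covering page_title.
SHOW INDEX FROM `page`;

-- In the standard MediaWiki schema the unique index on
-- (page_namespace, page_title) is called `name_title`:
ALTER TABLE `page` DROP INDEX `name_title`;
```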

4. Download the Dutch XML dump. Go to http://dumps.wikimedia.org/backup-index.html and search for the link 'nlwiktionary'. Click it, and on that page look for
pages-articles.xml.bz2. Download it and extract the XML file.

5. Download and open the wiki XML importer, mwdumper.jar.

6. Run mwdumper.jar, select the Dutch .xml file, set the database to wiktionarynl and start the import.
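If you prefer the command line over mwdumper's GUI, it can also write SQL straight into MySQL. This is a sketch; the dump file name, user and password below are placeholders matching the steps above, and `--format=sql:1.5` targets the 1.5-style schema used by these tables:

```shell
# Convert the XML dump to SQL INSERTs and pipe them straight into the
# wiktionarynl database. Adjust the file name and credentials to your setup.
java -jar mwdumper.jar --format=sql:1.5 nlwiktionary-pages-articles.xml \
  | mysql -u root -p --default-character-set=utf8 wiktionarynl
```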

Now you should have a database where the tables page, text and revision contain data.

7. Page titles live in the `page` table, while the article text lives in the `text` table. The `revision` table keeps track of which row in `text` is the latest version of each page.
To get the text that belongs to a record in the `page` table, do the following:

// Get a page row by its title.
$q = "SELECT * FROM page WHERE page_title = 'Amsterdam' LIMIT 1";
$results_page = mysql_query($q);
$row_page = mysql_fetch_array($results_page);

// page_latest points at the matching rev_id in the revision table.
$q = "SELECT * FROM revision WHERE rev_id = ".$row_page['page_latest']." LIMIT 1";
$results_revision = mysql_query($q);
$row_revision = mysql_fetch_array($results_revision);

// rev_text_id points at old_id in the text table, which holds the wiki markup.
$q = "SELECT * FROM `text` WHERE old_id = ".$row_revision['rev_text_id']." LIMIT 1";
$results_text = mysql_query($q);
$row_text = mysql_fetch_array($results_text);

echo '<li>'.$row_page['page_title'];
echo '<li>'.$row_text['old_text'];
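The three lookups above can also be collapsed into a single query that follows the same links described in step 7 (page_latest → rev_id, rev_text_id → old_id):

```sql
-- One round trip instead of three: page -> revision -> text.
SELECT p.page_title, t.old_text
FROM page AS p
JOIN revision AS r ON r.rev_id = p.page_latest
JOIN `text` AS t ON t.old_id = r.rev_text_id
WHERE p.page_title = 'Amsterdam'
LIMIT 1;
```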

With this script you can fetch thousands to millions of page titles plus the corresponding raw wiki markup.
The next step would probably be converting that raw wiki markup to plain text or HTML.
I tried several converters, but all of them seem to be buggy. Even when they appear to work well for the English and Dutch wikis, they don't work well for other languages like Vietnamese.
However, Wikipedia itself shows the data without bugs in 100 different languages, so somewhere in their code the conversion is done properly.

I parsed the data for 20 different languages and ended up with a lot of corrupted articles that still contained wiki markup.
For most articles I only needed the first paragraph, but many articles start with the 'infobox' that you often see on the right.

As a solution to get clean data, I used Wikipedia's API, for example for the 'nl' page for 'Amsterdam'.

Get the HTML text out of the JSON data:
$json = json_decode($jsondata);
$content = $json->{'parse'}->{'text'}->{'*'};

And then the regular expression <p>(.*?)</p> is enough to extract the first paragraph cleanly. They don't like it if you scrape their whole website, though; that's what the wiki dumps are for.
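As a small sketch of that extraction (the `$html` value here is a made-up stand-in for the `parse → text → *` HTML the API returns):

```php
<?php
// Made-up stand-in for the HTML returned by the parse API; real pages
// often start with an infobox table before the first paragraph.
$html = '<table class="infobox"><tr><td>...</td></tr></table>'
      . '<p>Amsterdam is de hoofdstad van Nederland.</p>'
      . '<p>Second paragraph.</p>';

// The non-greedy pattern grabs only the first <p>...</p>; the infobox
// is skipped because it contains no paragraph tags.
if (preg_match('#<p>(.*?)</p>#s', $html, $m)) {
    $first_paragraph = strip_tags($m[1]);
    echo $first_paragraph . "\n";
}
```

This prints "Amsterdam is de hoofdstad van Nederland." for the sample above; on real API output you would feed in `$content` from the snippet earlier instead.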