API Key Request for sselph/scraper

Posted: Sat Jul 14, 2018 11:44 pm
by sselph
Hi,

I'm the maintainer of https://github.com/sselph/scraper. It generates metadata XML files for EmulationStation and is used primarily by RetroPie users. It works by hashing a ROM and looking the hash up in a CSV I maintain to get the game's ID in your database, which it then uses to make GetGame requests. I would like to migrate this to the new API and need an API key to get started.
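
To illustrate the hash-then-lookup flow described above, here is a minimal Python sketch. It is not the scraper's actual code: the use of SHA-1, the two-column CSV layout (hash, ID), and every function name are assumptions made for the example.

```python
import csv
import hashlib
import io

def sha1_of_stream(stream, chunk_size=1 << 20):
    """Return the hex SHA-1 of a binary stream, read in chunks."""
    digest = hashlib.sha1()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()

def load_hash_map(csv_text):
    """Parse 'hash,game_id' rows (assumed layout) into a dict keyed by hash."""
    reader = csv.reader(io.StringIO(csv_text))
    return {row[0].strip().lower(): row[1].strip() for row in reader if len(row) >= 2}

def lookup_game_id(stream, hash_map):
    """Return the database ID for a ROM stream, or None if the hash is unknown."""
    return hash_map.get(sha1_of_stream(stream))

# Hypothetical mapping row: the SHA-1 of b"abc" mapped to a made-up ID.
SAMPLE_CSV = "a9993e364706816aba3e25717850c26c9cd0d89d,1234\n"
hash_map = load_hash_map(SAMPLE_CSV)

# io.BytesIO stands in for open(rom_path, "rb").
game_id = lookup_game_id(io.BytesIO(b"abc"), hash_map)
```

The ID returned here is what would then be passed along in a GetGame request; the request itself is left out because its exact shape is not described in this thread.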

Again thanks for maintaining a great database and if you ever want a mapping of No-Intro hash information to IDs in your database, let me know.

Re: API Key Request for sselph/scraper

Posted: Sun Jul 15, 2018 3:31 pm
by Zer0xFF
Hi sselph,

We currently have API keys on hold due to pending changes, and we hope to start re-issuing keys soon.
sselph wrote:
Sat Jul 14, 2018 11:44 pm
Again thanks for maintaining a great database and if you ever want a mapping of No-Intro hash information to IDs in your database, let me know.
This was something that was suggested to us as we rewrote the site. I'm not fully aware of the state of ROM dumps versus hashes out there, but I assume that since you're using them, they're reliable enough? If so, it might be worthwhile adding them to the site. Thanks.

Regards
Zer0xFF

Re: API Key Request for sselph/scraper

Posted: Sun Jul 15, 2018 6:46 pm
by sselph
The main issue with ROM hashes is that there are multiple file formats, headers, etc. for many systems (NES, SNES, N64, Megadrive). So most of my code takes the file the user has and attempts to convert it to something that can be properly hashed.
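
As a rough illustration of that conversion step, the Python sketch below strips two well-known headers before hashing: the 16-byte iNES header on NES dumps (magic bytes "NES" followed by 0x1A) and the 512-byte copier header found on some SNES dumps. It is a simplification and not the scraper's actual code: function names are hypothetical, SHA-1 is an assumed choice, and cases such as iNES trainers or N64 byte-order variants are not handled.

```python
import hashlib

INES_MAGIC = b"NES\x1a"  # first four bytes of an iNES header

def strip_ines_header(data):
    """Drop the 16-byte iNES header if the magic bytes are present."""
    if data[:4] == INES_MAGIC:
        return data[16:]
    return data

def strip_snes_copier_header(data):
    """Drop a 512-byte copier header, detected by size % 1024 == 512."""
    if len(data) % 1024 == 512:
        return data[512:]
    return data

# Hypothetical system keys; unknown systems are hashed unchanged.
STRIPPERS = {"nes": strip_ines_header, "snes": strip_snes_copier_header}

def normalized_sha1(data, system):
    """Hash the ROM payload after removing any known header for the system."""
    payload = STRIPPERS.get(system, lambda d: d)(data)
    return hashlib.sha1(payload).hexdigest()

# Synthetic data: the same 1 KiB payload with and without headers.
payload = b"\x00" * 1024
headered_nes = INES_MAGIC + b"\x00" * 12 + payload
headered_snes = b"\xff" * 512 + payload
```

With this normalization, the headered and headerless copies of the same payload produce the same hash, which is the property a hash-to-ID mapping depends on.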

Re: API Key Request for sselph/scraper

Posted: Sun Jul 22, 2018 8:50 pm
by Zer0xFF
Hi there,

You've been granted an API key which you can access through this page.

Please check the Announcement forum on a regular basis to ensure compliance with any changes or updates.

sselph wrote:
Sun Jul 15, 2018 6:46 pm
The main issue with ROM hashes is that there are multiple file formats, headers, etc. for many systems (NES, SNES, N64, Megadrive). So most of my code takes the file the user has and attempts to convert it to something that can be properly hashed.
I assume it's a reliable approach nonetheless, since you've gone as far as implementing it and actually using it in an application.
Initially I was considering using it as a search filter alongside the other filters, but it might be best to actually list them as hashes to keep the two distinct.

Regards
Zer0xFF

Re: API Key Request for sselph/scraper

Posted: Mon Jun 01, 2020 4:09 pm
by Krokenoster
If I may chime in: for the fun of it, I spun up the database that was made available in the "announcement" section. I noticed the games_hashes table was empty, so I created a Windows .NET-based tool that scans a local directory, calculates each ROM's MD5 checksum, and uploads it to a PHP REST web API.
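
The scanning half of such a tool could look roughly like the Python sketch below. The original tool is .NET-based, so this is only an illustration of the idea; the function names are hypothetical, and the upload step is omitted because the web API's endpoints are not described here.

```python
import hashlib
import tempfile
from pathlib import Path

def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def scan_rom_directory(root):
    """Map each file's path (relative to root) to its MD5 checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): md5_of_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

# Demo on a throwaway directory holding one stand-in "ROM".
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "example.rom").write_bytes(b"abc")
    checksums = scan_rom_directory(tmp)
```

The resulting dictionary of file names and checksums is the kind of payload that would then be sent to the server for matching.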

The problem was that some local ROMs have hyphens where the database version of the name has a colon, and sometimes the files are named with a leading number.

For example:
001 Super Mario All Stars
002 Mario. Bros (note the dot)

This makes matching somewhat challenging. Using MySQL LIKE '%%' patterns, or combinations of that and MySQL full-text search, is not accurate enough.
So instead I created a search engine that ranks game titles using TF-IDF, a weighting scheme commonly used in information retrieval and machine learning.
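
For readers unfamiliar with the technique, here is a small Python sketch of TF-IDF ranking with cosine similarity. It is not the code behind the search engine mentioned above: the tokenizer, the smoothed IDF formula, the sample titles, and all function names are assumptions chosen for the illustration.

```python
import math
import re
from collections import Counter

def tokenize(name):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", name.lower())

def build_index(titles):
    """Return (idf, vectors): smoothed IDF per term and one TF-IDF vector per title."""
    docs = [Counter(tokenize(t)) for t in titles]
    n = len(docs)
    df = Counter(term for doc in docs for term in doc)  # document frequency
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

def rank(query, titles, idf, vectors):
    """Score every title against the query; terms unknown to the index are ignored."""
    counts = Counter(t for t in tokenize(query) if t in idf)
    q_vec = {t: tf * idf[t] for t, tf in counts.items()}
    scored = [(cosine(q_vec, vec), title) for vec, title in zip(vectors, titles)]
    return sorted(scored, reverse=True)

# Hypothetical database titles, queried with file-name style input.
TITLES = ["Super Mario All-Stars", "Mario Bros.", "Super Mario Bros. 3", "Super Metroid"]
idf, vectors = build_index(TITLES)
best_all_stars = rank("001 Super Mario All Stars", TITLES, idf, vectors)[0][1]
best_mario_bros = rank("002 Mario. Bros", TITLES, idf, vectors)[0][1]
```

Because tokens are split on punctuation and the leading "001" or "002" does not occur in any title, the hyphen, dot, and numbering differences stop affecting the match in this sketch.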

It works quite well, if I may say so myself. It is still a work in progress, as the tool used to upload the hashes is not yet in a user-friendly state.

If you would like to take a look, see the link below:
https://www.retroskraper.com/project_game_hash.php

Here's a link to test the search engine if you would like to test it out.
https://www.retroskraper.com/tfidf_sear ... laystation
Tip: when searching, pretend the term you are entering is the file name of a ROM, because ultimately the search engine will search based on the file name.

Also, having multiple hashes for a ROM is not necessarily a bad thing. But these hashes need to go to a "staging" table so that they can be processed after the fact in case the search engine messed up.
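
One way to picture the staging idea is the Python sketch below. It uses an in-memory SQLite database as a stand-in for MySQL. Only the games_hashes table name comes from this thread; its columns, the games_hashes_staging table, and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games_hashes (          -- column layout is assumed
    game_id INTEGER NOT NULL,
    md5     TEXT    NOT NULL,
    UNIQUE (game_id, md5)            -- a game may have several distinct hashes
);
CREATE TABLE games_hashes_staging (  -- hypothetical holding area for uploads
    id              INTEGER PRIMARY KEY,
    file_name       TEXT    NOT NULL,
    md5             TEXT    NOT NULL,
    matched_game_id INTEGER,         -- the search engine's best guess
    match_score     REAL,
    reviewed        INTEGER NOT NULL DEFAULT 0
);
""")

def stage_hash(conn, file_name, md5, matched_game_id, match_score):
    """Record an uploaded hash with its tentative match; return the staging row id."""
    cur = conn.execute(
        "INSERT INTO games_hashes_staging (file_name, md5, matched_game_id, match_score) "
        "VALUES (?, ?, ?, ?)",
        (file_name, md5, matched_game_id, match_score),
    )
    conn.commit()
    return cur.lastrowid

def approve(conn, staging_id):
    """Copy a checked staging row into games_hashes and mark it as reviewed."""
    row = conn.execute(
        "SELECT matched_game_id, md5 FROM games_hashes_staging WHERE id = ?",
        (staging_id,),
    ).fetchone()
    conn.execute("INSERT OR IGNORE INTO games_hashes (game_id, md5) VALUES (?, ?)", row)
    conn.execute("UPDATE games_hashes_staging SET reviewed = 1 WHERE id = ?", (staging_id,))
    conn.commit()

# Made-up upload: file name, MD5, guessed game ID, and score are all illustrative.
staging_id = stage_hash(
    conn, "001 Super Mario All Stars.sfc", "900150983cd24fb0d6963f7d28e17f72", 42, 0.97
)
approve(conn, staging_id)
```

Keeping the guessed ID and score next to the raw upload means a wrong match can be corrected in staging without ever touching the live table.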