Say I have filenames that follow different formatting conventions. I want to extract certain fields from each filename the way a human would, by pattern recognition.
I could brute-force my way through with regular expressions, but that's not what I'm after. Take these four strings:
[MAS] Hayate no Gotoku!! 20 [BD 720p] [21D138F8].mkv
[Leopard-Raws] Akatsuki no Yona - 05 RAW (MX 1280x720 x264 AAC).mp4
[BLAST] Wolf Girl and Black Prince - 05 [720p] [C1252A5E].mkv
[sage]_Mobile_Suit_Gundam_AGE_-_36_[720p][10bit][45C9E0D0].mkv
As you can see, all of these filenames follow a similar pattern but are not quite the same, so a single silver-bullet regular expression wouldn't cut it. Instead, I want to look at computational intelligence techniques such as artificial neural networks (ANNs), or any other smart idea, to solve this problem.
Let's say we want to extract the titles. A human would return these values:
Hayate no Gotoku!!
Akatsuki no Yona
Wolf Girl and Black Prince
Mobile Suit Gundam AGE
Or, for the episode numbers: 20, 05, 05, 36. You get where I'm going with this.
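To make the target concrete, here is a minimal sketch of a hand-written heuristic baseline in Python. It is not the learned approach I'm asking about, only a reference point that any smarter technique should beat. The function name `parse_filename` and its rules are my own assumptions: bracketed or parenthesized groups are metadata, underscores stand in for spaces, and the last short all-digit token is the episode number.

```python
import re

def parse_filename(name):
    """Guess (title, episode) from a release-style filename (hypothetical helper)."""
    # Drop the file extension.
    stem = name.rsplit(".", 1)[0]
    # Assumption: [...] and (...) groups hold metadata (group, resolution, CRC).
    stem = re.sub(r"\[[^\]]*\]|\([^)]*\)", " ", stem)
    # Assumption: underscores are used in place of spaces.
    tokens = stem.replace("_", " ").split()
    # Assumption: the last token made of 1 to 3 digits is the episode number.
    for i in range(len(tokens) - 1, -1, -1):
        if re.fullmatch(r"\d{1,3}", tokens[i]):
            title_tokens = tokens[:i]
            # Drop a "-" separator sitting between title and episode.
            if title_tokens and title_tokens[-1] == "-":
                title_tokens.pop()
            return " ".join(title_tokens), tokens[i]
    # No episode-like token found: return everything as the title.
    return " ".join(tokens), None

if __name__ == "__main__":
    samples = [
        "[MAS] Hayate no Gotoku!! 20 [BD 720p] [21D138F8].mkv",
        "[Leopard-Raws] Akatsuki no Yona - 05 RAW (MX 1280x720 x264 AAC).mp4",
        "[BLAST] Wolf Girl and Black Prince - 05 [720p] [C1252A5E].mkv",
        "[sage]_Mobile_Suit_Gundam_AGE_-_36_[720p][10bit][45C9E0D0].mkv",
    ]
    for sample in samples:
        print(parse_filename(sample))
```

This handles the four samples above, but it is brittle by construction. A title that ends in a number, or a release group that puts metadata outside brackets, would likely break it, which is exactly why I'm looking for something that generalizes.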
Which techniques would be useful for achieving this result? Or is this still an open research problem, being studied at universities, with no solution yet?