Notepad showing gibberish for Japanese SRT file


biyakuga

When I open an SRT file that contains a Japanese translation, I see gibberish (unknown symbols instead of the text).

Things I already tried:
1. I installed the Japanese language pack.

2. I changed the settings as shown below and restarted my PC:
1707529308558.png

3. I tried saving the file from Notepad as UTF-8 instead of ANSI.

If I open the SRT file with Notepad++, it displays perfectly. So far, the only workaround I have found is to open the SRT file in Notepad++, copy the content into a new Notepad file, and save it as UTF-8.

I am trying to understand why this is happening and how I can fix it, so that I do not need to repeat this trick for every movie.

I attached the SRT file to this discussion.
 
Windows Build/Version
Windows 11 Pro 23H2

Attachments

  • homefront.(2013).jpn.1cd.(5574976).zip
    24.8 KB · Views: 2

My Computer

System One

  • OS
    Windows 11 Pro Version 23H2 (OS Build 22631. 3374)
    Computer type
    PC/Desktop
    Manufacturer/Model
    Custom Built
    CPU
    intel i7 13th 13700k
    Motherboard
    ROG STRIX Z790-F GAMING WIFI
    Memory
    VENGEANCE® 64GB (2x32GB) DDR5 DRAM 5600MT/s CL40 Memory Kit
    Graphics Card(s)
    NVidia founders edition 3080 ti
    Monitor(s) Displays
    27" Odyssey QHD 165Hz 1ms HDR10 Gaming Monitor
    Screen Resolution
    2k
    Keyboard
    Logitech G512
    Mouse
    Logitech G502 Hero
    Internet Speed
    1GB Fibers
    Browser
    Edge
    Antivirus
    ESET Smart Security
Hi there

@biyakuga

In Settings --> Add optional features:


Screenshot 2024-02-10 091022.png


Post your original subtitle file (the .srt, before translation) and I'll see what it looks like. Although I don't understand Japanese, I will know when I've got invalid or garbled characters or symbols.

Note also that there are plenty of excellent subtitle translation sites. They are free and translate .sub and .srt subtitles from almost any language to any other. The translation is pretty good, too: A.I. has improved machine translation so much these days that it's almost at pro level. It saves a load of manual translation.

E.g. (these places usually use Google Translate as their main engine; it's pretty good these days):


There are quite a few others around too. Also, if the program or film doesn't have subtitles, there are often English subtitles on opensubtitles.org. Download those and then run the translation.


cheers
jimbo
 

My Computer

System One

  • OS
    Windows XP,7,10,11 Linux Arch Linux
    Computer type
    PC/Desktop
    CPU
    2 X Intel i7
The Shift-JIS-To-UTF-8 converter should now download without any false-positive detection from Windows Defender as long as you have definitions 1.403.3529.0 or higher.
It happens with other languages too, such as Spanish, Chinese, Arabic, etc.

Maybe you have a more permanent solution?
 

Post examples.
Here I have attached an Arabic subtitle. As with the Japanese one, I cannot see the text, only squares with question marks. If I open the SRT file with Notepad++, it displays perfectly. The problem occurs only with Notepad.
 

Attachments

  • homefront.(2013).ara.1cd.(7535558).zip
    27.6 KB · Views: 1

Try this:
Yes, it fixed the problem, but I am trying to avoid using extra tools. As I said in my post, I can do the same with Notepad++.
I would like to know how to fix Notepad so that it displays the text correctly.
 

Notepad is not broken. There is nothing to "fix". What you're asking for is to extend the capabilities of Notepad. That cannot be done.

The files you provided are encoded in formats that Notepad does not support. The Japanese one is encoded as Shift JIS and the Arabic one is encoded as code page 1256. The only encoding schemes that Notepad supports are ANSI, UTF-8, UTF-8 BOM, UTF-16 LE, UTF-16 BE.

Notepad++ supports a very long list of encoding schemes and it auto-detects the encoding. That's not a simple thing to do. Encodings such as Shift JIS and code page 1256 do not have any signature or BOM (byte order mark). To determine the encoding, the file has to be read and statistically analyzed to find the most likely encoding. Notepad++ does that and so does the tool I provided (using a free encoding detection library DLL).
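
To make the detection problem concrete, here is a minimal Python sketch. It is not what Notepad++ or the converter tool actually does: it only checks for a BOM and then tries a short, hypothetical list of candidate encodings with strict decoding. The function name `guess_encoding` and the candidate list are assumptions for illustration, and `cp932` is assumed here as the Windows flavor of Shift JIS.

```python
import codecs
from typing import Optional

# Signatures (BOMs) that identify the Unicode encodings unambiguously.
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def guess_encoding(data: bytes,
                   candidates=("utf-8", "cp932", "cp1256")) -> Optional[str]:
    """Return a plausible encoding name for data, or None if nothing fits."""
    # Step 1: a BOM, when present, settles the question.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # Step 2: no BOM, so try each candidate and keep the first one
    # that decodes without an error. This is only a crude guess.
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None
```

This trial-and-error approach works for the easy cases, but legacy code pages accept almost any byte sequence, so the wrong candidate can decode "successfully" into nonsense. That is the reason real detectors score the decoded text statistically instead of stopping at the first candidate that does not raise an error.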

In summary, the best you can do is convert those files to UTF-8 so that they can be used with Notepad or any other software that supports UTF-8 (which is almost everything).
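
For anyone who prefers a script, the conversion itself can be sketched in a few lines of Python. This assumes you already know the source encoding (for example `cp932` for the Shift JIS file or `cp1256` for the Arabic one); the function name `convert_to_utf8` and the file names in the comment are hypothetical.

```python
from pathlib import Path

def convert_to_utf8(src: str, dst: str, source_encoding: str) -> None:
    """Re-encode a text file from source_encoding to UTF-8 with a BOM."""
    raw = Path(src).read_bytes()
    # Decode with the real source encoding; this raises UnicodeDecodeError
    # if the bytes are not valid in that encoding.
    text = raw.decode(source_encoding)
    # "utf-8-sig" writes a BOM so that editors do not have to guess.
    # Working on bytes keeps the original line endings untouched.
    Path(dst).write_bytes(text.encode("utf-8-sig"))

# Example call (hypothetical file names):
# convert_to_utf8("movie.jpn.srt", "movie.jpn.utf8.srt", "cp932")
```

Writing the BOM is a design choice: it is optional for UTF-8, but it removes any ambiguity when the file is opened later.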
 

My Computer

System One

  • OS
    Windows 10/11
    Computer type
    Laptop
    Manufacturer/Model
    Acer
I see. Thank you so much for the detailed explanation.

One more thing, please: I would like to know why Save As to UTF-8 in Notepad does not work.
For example, if I open the Japanese subtitle in Notepad and save it as UTF-8, as I saw in some solutions, it does not work here: the new UTF-8 file still shows the subtitles as [?].
 

Notepad cannot decode those files, so it cannot convert them correctly to UTF-8. If you started with a UTF-16 Japanese file, for example, Notepad could convert that to UTF-8. But when you start with a Shift JIS file, Notepad decodes it as ANSI, resulting in the wrong characters. That's not a bug. It's a limitation. Correctly decoding Shift JIS, code page 1256, etc. requires the app to have an encode/decode table for that format and either ask the user to select the correct format or use an analysis algorithm to detect the encoding. Only Microsoft could, in theory, add such capabilities to Notepad.
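
The effect can be reproduced with a small Python sketch. It assumes that "ANSI" on the machine in question means code page 1252 (typical for English-language Windows, but an assumption here) and uses a short sample string in place of the real subtitle file.

```python
original = "こんにちは"

# Bytes as they are stored in a Shift JIS (cp932) subtitle file.
shift_jis_bytes = original.encode("cp932")

# What happens on open: the bytes are interpreted with the wrong table.
# cp1252 stands in for "ANSI" here; that is an assumption.
misread = shift_jis_bytes.decode("cp1252", errors="replace")

# What happens on Save As UTF-8: the already wrong characters are
# encoded faithfully, so the damage is preserved in the new file.
saved_as_utf8 = misread.encode("utf-8")

print(misread == original)                         # False
print(saved_as_utf8.decode("utf-8") == misread)    # True
print(shift_jis_bytes.decode("cp932") == original) # True
```

The last line shows the only step that recovers the text: decoding the original bytes with the correct table before re-encoding.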

If you could provide a link to the "solutions" that you saw, I can take a look and see if there is anything more to it.
 
