Topic: unicode support request

Peter

  • Guest
unicode support request
« on: 17 Jun '03 - 12:20 »
Hello.

An option to load filenames with Unicode characters would be useful to me, and probably to other people too.  Some files on my HDD have names containing non-ANSI characters, and those can't currently be loaded by BASS.

Something like :
BASS_StreamCreateFile(FALSE,L"afile.mp3",0,0,BASS_FILE_UNICODE);

Is it possible to implement something similar?

Ian @ un4seen

  • Administrator
  • Posts: 21861
Re: unicode support request
« Reply #1 on: 17 Jun '03 - 14:33 »
It shouldn't be a problem to quickly add the option for BASS 1.8a :)

Peter

  • Guest
Re: unicode support request
« Reply #2 on: 20 Jun '03 - 09:13 »
Thanks for the new feature.
It's working great!

Irrational86

  • Posts: 960
Re: unicode support request
« Reply #3 on: 20 Jun '03 - 15:13 »
One question, Ian: if you implement Unicode character support, why don't you just make it the default instead of requiring a flag? What difference does the flag make when loading regular non-Unicode filenames?

Ian @ un4seen

  • Administrator
  • Posts: 21861
Re: unicode support request
« Reply #4 on: 20 Jun '03 - 17:25 »
Unicode characters are 16-bit, while ANSI characters are 8-bit. So different processing is required for each, and BASS needs to know what type of string the parameter is - hence the flag :)
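
To illustrate why the caller has to say which kind of string it is passing, here is a minimal C sketch. It is not BASS code: EXAMPLE_FLAG_UNICODE and filename_length are made-up names for illustration only. The function receives an untyped pointer and can only pick the right element size because the flag tells it to.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag for this sketch only; not a BASS constant. */
#define EXAMPLE_FLAG_UNICODE 0x1u

/* Return the filename length in characters. The pointer carries no type
   information, so the flag decides whether to read 8-bit or 16-bit units. */
size_t filename_length(const void *file, unsigned flags)
{
    if (flags & EXAMPLE_FLAG_UNICODE) {
        const uint16_t *w = file;   /* 16-bit characters, 16-bit terminator */
        size_t n = 0;
        while (w[n] != 0)
            n++;
        return n;
    }
    return strlen((const char *)file);  /* 8-bit characters, 8-bit terminator */
}
```

Called with the flag, the function walks the buffer two bytes at a time; without it, one byte at a time. Nothing in the pointer itself says which is correct.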

Irrational86

  • Posts: 960
Re: unicode support request
« Reply #5 on: 20 Jun '03 - 20:33 »
And what if you always use the flag? Will it affect the loading of non-Unicode filenames, or affect anything at all?

DanaPaul

  • Posts: 335
Re: unicode support request
« Reply #6 on: 20 Jun '03 - 22:14 »

Quote

Unicode characters are 16-bit, while ANSI characters are 8-bit. So different processing is required for each, and BASS needs to know what type of string the parameter is - hence the flag :)


Well, the Unicode ID3v2 tags in MP3 files that I've come across have the first 2 bytes of the string set to indicate a Unicode string, so I've been able to detect these (rare) Unicode strings on the fly.  Would something like this accommodate most Unicode instances?

In Delphi speak...

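function FixStr(StrToFix: string): string;
var
  s: string;  // was used without being declared
begin
  Result := StrToFix;
  if Length(Result) > 1 then
  begin
    // Bytes $FF $FE at the start mark a little-endian Unicode string
    if (Ord(Result[1]) = 255) and (Ord(Result[2]) = 254) then
    begin
      if Length(Result) > 2 then
      begin
        // Drop the 2-byte marker, then convert the wide characters to ANSI.
        // Assumes string is an 8-bit AnsiString holding the raw tag bytes.
        Result := System.Copy(Result, 3, Length(Result));
        s := WideCharToString(PWideChar(Result));
        Result := StrPas(PChar(s));
      end
      else
        Result := '';
    end;
  end;
end;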


Ian @ un4seen

  • Administrator
  • Posts: 21861
Re: unicode support request
« Reply #7 on: 21 Jun '03 - 12:16 »
Here's the problem... for example, if you have the Unicode string "ABC", in byte form that is 'A',0,'B',0,'C',0,0,0 - BASS can't know if that's meant to be a Unicode "ABC" or an ANSI "A". It could check beyond the trailing 0, but that's asking for access violations :)
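
The ambiguity can be shown with a few lines of C. This is only a sketch with made-up names (abc_utf16le, len_as_ansi, len_as_utf16), not anything from BASS: the same eight bytes are measured once as an 8-bit string and once as a 16-bit string.

```c
#include <stddef.h>
#include <string.h>

/* "ABC" as little-endian 16-bit characters, followed by a 16-bit terminator. */
static const unsigned char abc_utf16le[] = {
    'A', 0, 'B', 0, 'C', 0, 0, 0
};

/* Length as an 8-bit (ANSI) string: bytes up to the first zero byte. */
size_t len_as_ansi(const unsigned char *s)
{
    return strlen((const char *)s);
}

/* Length as a 16-bit string: byte pairs up to the first all-zero pair. */
size_t len_as_utf16(const unsigned char *s)
{
    size_t n = 0;
    while (s[2 * n] != 0 || s[2 * n + 1] != 0)
        n++;
    return n;
}
```

len_as_ansi(abc_utf16le) gives 1 and len_as_utf16(abc_utf16le) gives 3. Both readings are well-formed, so the bytes alone can't tell the callee which one was intended.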

Irrational86

  • Posts: 960
Re: unicode support request
« Reply #8 on: 21 Jun '03 - 13:36 »
OK, OK, now I get it :laugh: ... thanks a lot, Ian

DanaPaul

  • Posts: 335
Re: unicode support request
« Reply #9 on: 22 Jun '03 - 13:33 »

Quote

Here's the problem... for example, if you have the Unicode string "ABC", in byte form that is 'A',0,'B',0,'C',0,0,0


I haven't had any problem with Unicode strings that set the first 2 bytes (ahem, one DoubleByte or Word) to FFFE indicating a Unicode string.  Flags scattered about the file or flagged function parameters are not needed.

However, you can do as you please, you're the boss :)

Ian @ un4seen

  • Administrator
  • Posts: 21861
Re: unicode support request
« Reply #10 on: 23 Jun '03 - 14:40 »
Quote
I haven't had any problem with Unicode strings that set the first 2 bytes (ahem, one DoubleByte or Word) to FFFE indicating a Unicode string.

That's only in ID3v2 tags, to indicate the byte order... you can actually check the previous byte to see if it's a Unicode string, 1 = yes :)

DanaPaul

  • Posts: 335
Re: unicode support request
« Reply #11 on: 23 Jun '03 - 18:24 »

Quote
That's only in ID3v2 tags, to indicate the byte order... you can actually check the previous byte to see if it's a Unicode string, 1 = yes :)


Oh?  I'll have to look into that.  The flag is part of the Tag frame, eh?

I haven't parsed WMA files (without Bass) yet, and I don't plan to. :)

Ian @ un4seen

  • Administrator
  • Posts: 21861
Re: unicode support request
« Reply #12 on: 23 Jun '03 - 23:02 »
Quote
Oh?  I'll have to look into that.  The flag is part of the Tag frame, eh?

Yep...
Quote
If nothing else is said a string is represented as ISO-8859-1 characters in the range $20 - $FF. Such strings are represented as <text string>, or <full text string> if newlines are allowed, in the frame descriptions. All Unicode strings use 16-bit unicode 2.0 (ISO/IEC 10646-1:1993, UCS-2). Unicode strings must begin with the Unicode BOM ($FF FE or $FE FF) to identify the byte order.

All numeric strings and URLs are always encoded as ISO-8859-1. Terminated strings are terminated with $00 if encoded with ISO-8859-1 and $00 00 if encoded as unicode. If nothing else is said newline character is forbidden. In ISO-8859-1 a new line is represented, when allowed, with $0A only. Frames that allow different types of text encoding have a text encoding description byte directly after the frame size. If ISO-8859-1 is used this byte should be $00, if Unicode is used it should be $01. Strings dependent on encoding is represented as <text string according to encoding>, or <full text string according to encoding> if newlines are allowed. Any empty Unicode strings which are NULL-terminated may have the Unicode BOM followed by a Unicode NULL ($FF FE 00 00 or $FE FF 00 00).

See the specs at www.id3.org for full details :)
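
As a rough sketch of the rule quoted above, the following C function looks at the text encoding description byte and, for Unicode, at the BOM that follows it. The names text_kind and classify_text are invented for this example, and it assumes the caller has already located the encoding byte inside the frame. It covers only the two encodings named in the quoted passage.

```c
#include <stddef.h>

typedef enum {
    TEXT_LATIN1,    /* encoding byte $00: ISO-8859-1 */
    TEXT_UTF16_LE,  /* encoding byte $01, BOM $FF $FE */
    TEXT_UTF16_BE,  /* encoding byte $01, BOM $FE $FF */
    TEXT_INVALID    /* unknown encoding byte, or Unicode without a BOM */
} text_kind;

/* body points at the text encoding description byte of a frame;
   len is the number of bytes available starting from that byte. */
text_kind classify_text(const unsigned char *body, size_t len)
{
    if (len < 1)
        return TEXT_INVALID;
    if (body[0] == 0x00)
        return TEXT_LATIN1;
    if (body[0] == 0x01 && len >= 3) {
        if (body[1] == 0xFF && body[2] == 0xFE)
            return TEXT_UTF16_LE;
        if (body[1] == 0xFE && body[2] == 0xFF)
            return TEXT_UTF16_BE;
    }
    return TEXT_INVALID;
}
```

The encoding byte answers "is this Unicode?" and the BOM only answers "which byte order?", which is why checking for $FF $FE alone works on many files but is not what the spec relies on.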

DanaPaul

  • Posts: 335
Re: unicode support request
« Reply #13 on: 24 Jun '03 - 00:46 »

Quote
See the specs at www.id3.org for full details :)


No need to lobby for self identifying strings, already specified in the standards. :)

A quote from id3.org... Hee hee...

(Software which does not behave according to items 1 and 2 above are categorically deemed "broken." Microsoft's Media Player is an example of such software.)