Can the Encoding API decode a Stream/noncontinuous bytes?

Usually we can get a string from a byte[] using something like

var result = Encoding.UTF8.GetString(bytes);

However, I have this problem: my input is an IEnumerable<byte[]> bytes (the implementation can be any structure of my choice). A character is not guaranteed to lie entirely within one byte[] (for example, a 2-byte UTF8 char can have its 1st byte in bytes[1][length - 1] and its 2nd byte in bytes[2][0]).

Is there any way to decode them without merging/copying all the arrays together? UTF8 is the main focus, but it would be better if other Encodings could be supported too. If there is no other solution, I think implementing my own UTF8 reader would be the way.

I plan to stream them using a MemoryStream, however Encoding cannot work on Stream, just byte[]. If merged together, the potential result array may be very large (up to 4GB in List<byte[]> already).

I am using .NET Standard 2.0. I wish I could use 2.1 (it is not released yet), because using Span<byte[]> would be perfect for my case!

The Encoding class can't deal with that directly, but the Decoder returned from Encoding.GetDecoder() can (indeed, that's its entire reason for existing). StreamReader uses a Decoder internally.

It's slightly fiddly to work with though, as it needs to populate a char[], rather than returning a string (Encoding.GetString() and StreamReader normally handle the business of populating the char[]).

The problem with using a MemoryStream is that you're copying all of the bytes from one array to another, for no gain. If all of your buffers are the same length (bufferSize below), you can do this:

var decoder = Encoding.UTF8.GetDecoder();
// +1 in case it includes a work-in-progress char from the previous buffer
// (GetMaxCharCount lives on Encoding, not on Decoder)
char[] chars = new char[Encoding.UTF8.GetMaxCharCount(bufferSize) + 1];
foreach (var byteSegment in bytes)
{
    int numChars = decoder.GetChars(byteSegment, 0, byteSegment.Length, chars, 0);
    Debug.WriteLine(new string(chars, 0, numChars));
}

If the buffers have different lengths:

var decoder = Encoding.UTF8.GetDecoder();
char[] chars = Array.Empty<char>();
foreach (var byteSegment in bytes)
{
    // +1 in case it includes a work-in-progress char from the previous buffer
    int charsMinSize = Encoding.UTF8.GetMaxCharCount(byteSegment.Length) + 1;
    if (chars.Length < charsMinSize)
        chars = new char[charsMinSize];
    int numChars = decoder.GetChars(byteSegment, 0, byteSegment.Length, chars, 0);
    Debug.WriteLine(new string(chars, 0, numChars));
}
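
To make the loop above easier to reuse, here is a minimal sketch that wraps it in an iterator. The class name SegmentDecoding and the method DecodeSegments are hypothetical names chosen for illustration; only Encoding.GetDecoder(), Encoding.GetMaxCharCount() and Decoder.GetChars() come from the framework. It assumes no segment is null, and it yields one string per segment rather than building one huge string.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SegmentDecoding
{
    // Hypothetical helper: decodes each segment in turn with one shared Decoder,
    // so a character split across two segments is reassembled.
    public static IEnumerable<string> DecodeSegments(IEnumerable<byte[]> segments, Encoding encoding)
    {
        Decoder decoder = encoding.GetDecoder();
        // Start with enough room for the final flush of leftover bytes.
        char[] chars = new char[encoding.GetMaxCharCount(0)];

        foreach (byte[] segment in segments)
        {
            // Grow the scratch buffer only when a larger segment arrives.
            int needed = encoding.GetMaxCharCount(segment.Length);
            if (chars.Length < needed)
                chars = new char[needed];

            // flush: false keeps a trailing partial character inside the decoder
            // until the next segment supplies the remaining bytes.
            int numChars = decoder.GetChars(segment, 0, segment.Length, chars, 0, false);
            if (numChars > 0)
                yield return new string(chars, 0, numChars);
        }

        // flush: true emits whatever the decoder's fallback produces for an
        // incomplete sequence left at the very end of the input.
        int tail = decoder.GetChars(Array.Empty<byte>(), 0, 0, chars, 0, true);
        if (tail > 0)
            yield return new string(chars, 0, tail);
    }
}
```

With UTF8, the two bytes of "é" (0xC3, 0xA9) can sit at the end of one segment and the start of the next, and the character still comes out intact in the second string. The caller decides what to do with each piece (write it out, parse it, append it), so nothing larger than one segment's worth of chars is held at a time.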

however Encoding cannot work on Stream, just byte[].

Correct, but a StreamReader (which is a TextReader) can be attached to a Stream.

So just create that MemoryStream, push bytes in on one end and use ReadLine() on the other. I must say I have never tried that.

Working code based on Henk's answer using StreamReader:

    using (var memoryStream = new MemoryStream())
    {
        using (var reader = new StreamReader(memoryStream))
        {
            foreach (var byteSegment in bytes)
            {
                // Truncate and rewind, so a shorter segment does not leave stale bytes from the previous one
                memoryStream.SetLength(0);
                await memoryStream.WriteAsync(byteSegment, 0, byteSegment.Length);
                memoryStream.Seek(0, SeekOrigin.Begin);

                Debug.WriteLine(await reader.ReadToEndAsync());
            }
        }
    }
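
As the comments below note, a custom Stream turned out to be the better fit than rewinding a MemoryStream. The following is only a sketch of that idea, not the code actually used: SegmentStream is a hypothetical name, and it assumes the segments only need to be read once, front to back.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical read-only, forward-only Stream over a sequence of byte[] segments.
// It never merges the segments; Read copies only what the caller asks for.
public sealed class SegmentStream : Stream
{
    private readonly IEnumerator<byte[]> _segments;
    private byte[] _current = Array.Empty<byte>();
    private int _offset;

    public SegmentStream(IEnumerable<byte[]> segments)
    {
        _segments = segments.GetEnumerator();
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        // Move past exhausted (or empty) segments until there is data or no input left.
        while (_offset == _current.Length)
        {
            if (!_segments.MoveNext())
                return 0; // end of stream
            _current = _segments.Current ?? Array.Empty<byte>();
            _offset = 0;
        }

        // Hand out at most the remainder of the current segment.
        int n = Math.Min(count, _current.Length - _offset);
        Buffer.BlockCopy(_current, _offset, buffer, offset, n);
        _offset += n;
        return n;
    }

    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing)
            _segments.Dispose();
        base.Dispose(disposing);
    }
}
```

Wrapping it as new StreamReader(new SegmentStream(bytes), Encoding.UTF8) gives ReadLine() and ReadToEnd() over the whole sequence. The StreamReader's internal Decoder carries partial characters across segment boundaries, and the only copying is into the reader's own small buffer.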

Comments
  • MemoryStream.WriteAsync is always synchronous. There's no point in calling it ever, just use MemoryStream.Write and save some await overhead. Likewise StreamReader.ReadToEndAsync will call MemoryStream.ReadAsync, which is synchronous. Using await here is only adding completely unnecessary overhead.
  • Thanks for your advice. However, we have a principle of not depending on implementation details. In this case, we simply trust that the implementation does the right thing, as it is not too critical (speed is currently not a problem, but memory space is).
  • Another way of looking at this: you have a choice between the synchronous and the asynchronous APIs. Always blindly taking one or the other is silly - you need to weigh them up for each situation. Here, it's obvious (even without delving into the implementation) that synchronously copying a few bytes is always going to be faster than doing it asynchronously - either the implementation will spin up a whole new thread (100x slowdown at least), or it will do a synchronous copy but pretend it's asynchronous (still overhead). Blindly using async != trusting the implementation.
  • Please post your solution as an answer.
  • @TomBlodget I have just done that. While it is good, I think canton7's solution works better in general.
  • Wonderful! Thanks a lot, this is very helpful.
  • Funny, I am writing a custom StreamReader implementation because my business logic requires a very weird reading pattern, and I totally forgot about StreamReader itself! Yes, it should work. I will try it now and post the result soon.
  • Yes it works! I have added the code segment in my question, please copy it over your answer and I will mark it :) EDIT: turned out what I needed to write is a custom Stream, not custom StreamReader!
  • You can post your code as a self-answer. But your quick-test won't deal with broken characters, do point that out.
  • And yes, the MemoryStream only has 1 Position so you can't read/write it like a queue. A pity sometimes.