Also, if you want to get further into a language, being thrown into the deep end is apparently one of the best ways to go about it. You're forced to use the language, so you pick it up pretty quickly.
Note on this: being thrown into the deep end is the best way to go from knowing absolutely nothing to "Soon bus coming. Me want know how many till arriving." (i.e., understandable and able to get your point across, but people will probably look at you funny). It's one of the worst ways to reach a natively fluent and literate level in a language; there's a reason we take grammar classes even in our native languages: "just picking it up as you go" is the kind of approach that takes decades of corrections before you fully grasp everything. You'll learn more about speaking properly in a month of regular classes than you will in years of osmosis.
In short:
"Deep end": Good at learning basics and forming a "workable" vocabulary. Great for learning the "what" of what you say. Bad for not sounding like a caveman.
Traditional Class: Good at learning grammar and parts of speech (conjunctions, contractions, etc.). Great for learning the "how" of how you talk. Bad for learning basics and vocabulary.
Suppose this is a sound file, converted into a .raw, opened in Photoshop to be edited somewhat by drawing on it (yes, seriously. See Glitch Audio). How do audio players play the file if I were to scratch the third pixel from the top left? Do they play it from left to right, working downwards row by row? Play it all at once from the left side to the right? Top-down? Bottom-up?
Generally, pixel data is given from left to right and top to bottom. However, this is not necessarily standardized; there are actually two fields you can set (Pixel Order and Scanline Order) that can invert either of these, and some applications actually require you to read the image data from right to left or from bottom to top (or both). Specifically, what determines how it's read is bytes 19 and 20 of the file (for Oracle standard RAW image data, that is). Since any audio files using the MPEG audio format (including mp3 files) use a much shorter header (~4 bytes), the actual order in which the pixels relate to the data will be determined by whatever bytes 15 and 16 of your audio data happen to be. If you're going to be handling any RAW images (which that linked file is not; it's a .png there), you can most likely see those settings at some location in your editor (alternatively, you could crack open the raw hex and take a peek at the first 20 or so bytes of the file).
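To make the four scan orders concrete, here's a minimal Python sketch of how a flat data offset maps to pixel coordinates. The flag names mirror the Pixel Order / Scanline Order fields mentioned above, but their exact semantics here are an assumption for illustration, not taken from any format's actual spec, and it assumes one byte per pixel.

```python
def offset_to_pixel(offset, width, height, right_to_left=False, bottom_to_top=False):
    """Map a flat byte offset to an (x, y) pixel, assuming 1 byte per pixel.

    Default order is left-to-right within a row, rows top-to-bottom;
    the two flags invert the horizontal and vertical directions.
    """
    x = offset % width    # position within the scanline
    y = offset // width   # which scanline we're on
    if right_to_left:
        x = width - 1 - x
    if bottom_to_top:
        y = height - 1 - y
    return (x, y)

# A 4x3 image: offset 0 lands at the top-left in the default order...
print(offset_to_pixel(0, 4, 3))              # (0, 0)
print(offset_to_pixel(5, 4, 3))              # (1, 1)
# ...but at the bottom-right when both order flags are inverted.
print(offset_to_pixel(0, 4, 3, True, True))  # (3, 2)
```

So a scratch on "the third pixel from the top left" corrupts a different moment of audio depending on which of these orderings the reader and writer agreed on.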
Of course, that's just based on the Oracle RAW image data format, which we are lucky enough to have documentation for. All bets are off if you are using a different file type, many of which have no viewable documentation from the companies that made them (that way they can sell you software to work with them). ".raw", for example, is made by Panasonic, and a cursory googling reveals a grand total of zero documentation on its file header format. It's certainly possible for your data to be read some other way; in fact, the way it's read is almost certainly going to depend on the early bytes of any audio you convert into an image file (since audio headers are somewhat documented, if only by people who liked messing with them, and they are as a rule much shorter than image headers). Beyond that, it's a total crapshoot what you are going to get (heck, it's even possible you could end up with weird things depending on what the image editor and audio player you are trying to use actually support, on top of what file type you are using to store the image data).
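If you do end up cracking open the raw hex yourself, a few lines of Python will dump the first 20 bytes of any file for eyeballing. The filename `mystery.raw` is a placeholder; the snippet fabricates a tiny dummy file first so it runs standalone.

```python
import pathlib

# Stand-in for your real file: 32 bytes counting up from 0.
pathlib.Path("mystery.raw").write_bytes(bytes(range(32)))

# Read just the header region and print it as space-separated hex.
header = pathlib.Path("mystery.raw").read_bytes()[:20]
print(" ".join(f"{b:02x}" for b in header))
```

With real data you'd point this at your actual .raw file and compare what you see against whatever header documentation you can scrounge up.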