1. Multi-timbral means the instrument can play 2 or more different sounds at the same time (timbre = voice). A standard synth is 16-part multi-timbral, to correspond with the 16 channels in MIDI. Being 16-part multi-timbral lets you play standard MIDI files or other sequenced tracks on the keyboard. Many stage and digital pianos are at least 2-part multi-timbral, which lets you layer 2 sounds at the same time (e.g. piano + strings).
2. Polyphony = the max number of notes a keyboard can play at any one time. If a keyboard is 32-note polyphonic, it can only have 32 notes sounding at once. If you hit a 33rd note, one of the sounding notes gets cut off suddenly, creating an undesirable effect. 32 notes sounds like a lot, but in complex piano pieces (especially when you use the sustain pedal) you easily hit 32 notes. Simple pieces should be OK. Sometimes, depending on the keyboard, one sound may be built from 2 or more sound elements to achieve a certain effect. If one sound has 2 elements, you only need to hit 16 notes before running out of polyphony on a 32-note polyphonic synth. Most keyboards are at least 64-note polyphonic, and we are seeing more and more 128-note polyphonic keyboards. But of course, many of these keyboards use up to 4 elements per voice, which means the real polyphony is lower. Still, having more polyphony allows one to layer more sounds to achieve more complex effects and/or more realism.
3. Controllers come in various forms. The 2 most standard ones on synths control: 1. pitch bend and 2. modulation. Pitch bend, as the name implies, bends the pitch of a note. Standard programming bends the note up or down a tone, depending on whether you move the controller up or down. You can program it to bend more than a tone - up to a few octaves if you like. Pitch bend is usually spring loaded, i.e. when you let go, it springs back to its resting position. The modulation controller can be assigned to whatever parameter you want. It usually controls the LFO (low-frequency oscillator) but can be set to anything you like (volume etc., or even pitch bend). It is usually not spring loaded. We use it to add vibrato, change the timbre of a sound, add the Leslie effect to a Hammond organ etc. I find it indispensable to a keyboardist.
Now pitch bend and mod controllers come in various forms. One is the wheel form, which I'm most used to. Another is the stick form, which most Roland keyboards have adopted (and which I don't quite like). And of course, there's the ribbon form. A ribbon sits where the wheel would be, but without the wheel itself. It is touch sensitive, and you control it by simply putting your finger on it and moving it up and down (or left and right, depending on how it is aligned). In electronic music, when you program the mod controller to certain parameters, the ribbon controller is quite helpful (since you can jump between the two extremes very quickly with the touch of a finger). On a digital piano, we don't usually do that. We usually change the modulation gradually.
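To make the controller part a bit more concrete, here is a rough Python sketch of what a keyboard sends over MIDI when you move the pitch bend or the mod wheel. The function names are my own for illustration, and the bend range of 2 semitones (a tone) is just the usual default, so treat this as a sketch rather than how any particular keyboard does it. It relies on pitch bend being a 14-bit value with 8192 as the resting position, and on the mod wheel being controller number 1.

```python
def pitch_bend_message(semitones, bend_range=2.0, channel=0):
    """Build a 3-byte MIDI pitch bend message (illustrative helper).

    semitones  -- how far to bend, negative = down
    bend_range -- assumed range programmed on the synth (2 = a tone)
    channel    -- MIDI channel 0..15 (shown as 1..16 on most keyboards)
    """
    if not 0 <= channel <= 15:
        raise ValueError("MIDI channel must be 0..15")
    # Clamp the request to what the programmed range allows.
    semitones = max(-bend_range, min(bend_range, semitones))
    # 8192 is the centre (wheel at rest); full travel covers 0..16383.
    value = 8192 + round(semitones / bend_range * 8192)
    value = max(0, min(16383, value))
    lsb = value & 0x7F          # low 7 bits
    msb = (value >> 7) & 0x7F   # high 7 bits
    return bytes([0xE0 | channel, lsb, msb])


def mod_wheel_message(amount, channel=0):
    """Build a MIDI control change message for the mod wheel (CC 1)."""
    if not 0 <= channel <= 15:
        raise ValueError("MIDI channel must be 0..15")
    amount = max(0, min(127, int(amount)))  # controllers are 7-bit
    return bytes([0xB0 | channel, 1, amount])
```

Letting go of a spring-loaded wheel is the same as sending `pitch_bend_message(0)`. What the mod wheel value actually does (vibrato, Leslie speed etc.) is decided by how the sound is programmed on the receiving end, not by the message itself.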
That's my "short" answer to your questions. I think if the SP88X is not multi-timbral, it will also use only 1 element per voice, since by definition it cannot have 2 or more elements. That means:
1. You can play simple pieces without running into polyphony problems (but don't think about playing Rachmaninoff or Debussy, or other complex pieces).
2. Using only one element per voice means the sound is only as good as the raw sampled wave, since they cannot layer more sounds to make it more realistic. A single element can still sound good, provided a lot of RAM is dedicated to the raw waveform, which is usually not the case in hardware. So I'm assuming the sounds in the SP88X are thinner and not as good as those of other keyboards, but I may be proved wrong. Not having heard it, I can only go by speculation, which is not fair to the keyboard. So you have to listen to the keyboard and compare it with others to judge for yourself.
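If it helps, the polyphony arithmetic from point 2 of the first list (a 32-note keyboard with 2 elements per sound gives only 16 real notes) can be sketched in a few lines of Python. The names here are made up for illustration, and the rule that the oldest note is the one that gets cut off is my assumption; real keyboards decide which note to drop in their own ways.

```python
from collections import deque


def effective_polyphony(total_voices, elements_per_sound):
    """Notes you can really hold when each note uses several elements."""
    if total_voices < 0 or elements_per_sound < 1:
        raise ValueError("need total_voices >= 0 and elements_per_sound >= 1")
    return total_voices // elements_per_sound


class VoicePool:
    """Toy model of a keyboard running out of polyphony.

    Assumption: the oldest sounding note is the one cut off.
    """

    def __init__(self, total_voices, elements_per_sound=1):
        self.capacity = effective_polyphony(total_voices, elements_per_sound)
        if self.capacity < 1:
            raise ValueError("not enough voices for even one note")
        self.sounding = deque()

    def note_on(self, note):
        """Start a note; return the note that was cut off, or None."""
        stolen = None
        if len(self.sounding) >= self.capacity:
            stolen = self.sounding.popleft()  # oldest note is dropped
        self.sounding.append(note)
        return stolen


print(effective_polyphony(32, 2))    # 32-note synth, 2 elements per sound
print(effective_polyphony(128, 4))   # 128-note synth, 4 elements per sound
```

With the sustain pedal down, every key you press stays in the pool, which is why complex pieces reach the limit much sooner than you would expect from counting fingers.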