Actually, that's more efficient than interleaving key/value pairs, because of memory caching. If you interleave the data, each cache line contains both key and value bytes, and you have to read past the values just to get at the keys. If you separate out the keys, the CPU's hardware prefetcher works much better: you read straight through the dense key array, then do a single lookup into the value array once you know you have a matching key.
Obviously, this effect is more pronounced for big data sets or when the values are large. For example, an empty C++ std::string is ~24 bytes (the exact figure varies by standard library). Long string contents are stored elsewhere on the heap (short ones may fit inline via the small-string optimization), but those ~24 bytes are the minimum footprint of the std::string object itself.
So if you have int keys and std::string values, interleaving them means reading 28 bytes per key you compare. Loading a 64-byte cache line costs roughly 200 clock cycles, so you're paying maybe 90-100 clock cycles per key checked. If the int keys sit in their own array, the same cache-line load brings in 16 of them, and the resulting ~12 clock cycles per key is plenty of time to compare two ints and increment a pointer. So that's a speedup factor of about 8 just from keeping the int keys and std::string values in separate arrays.
But of course, linear arrays are pretty brute-force for this kind of problem. If the keys are sortable in any way, you can exploit that: since a dictionary imposes no ordering requirements of its own, you're free to store the keys in a sorted structure from the start, whether a sorted array with binary search, a binary tree, or a hashing scheme if you don't need ordering at all.