Bay 12 Games Forum

Please login or register.

Login with username, password and session length
Advanced search  
Pages: 1 ... 316 317 [318] 319 320 ... 796

Author Topic: if self.isCoder(): post() #Programming Thread  (Read 882627 times)

Rose

  • Bay Watcher
  • Resident Elf
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4755 on: July 26, 2013, 10:13:04 pm »

This may just be my current migraine talking, but what about hashing a string made by concatenating the two integers?  Eg. if your pair contains 45 and -67, hash "45-67"?

If you use the sign of the second number as the separator between the first and the second, I'm having a hard time finding a reason why this wouldn't result in a suitable hash.

how about long Pair = int1 * MAX_INT + int2?

{int1=1, int2=MAX_INT} would produce the same value as {int1=2, int2=0}.
Sorry, on review, it should be (MAX_INT+1), which is not possible for 2 to be.

how about long Pair = int1 * MAX_INT + int2?
uint_64 might work o_O?
If Jaoa's idea won't work for some reason, someone tell me! :D
uint64_t is the same as unsigned long.

Note that my method will not work with negative numbers.

MagmaMcFry's union idea would be better, in that case.
« Last Edit: July 26, 2013, 10:15:58 pm by Japa »
Logged

RulerOfNothing

  • Bay Watcher
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4756 on: July 26, 2013, 10:18:59 pm »

If the points are fairly evenly spread out, couldn't you just use one of the co-ordinates as the hash value?
Logged

Skyrunner

  • Bay Watcher
  • ?!?!
    • View Profile
    • Portfolio
Re: if self.isCoder(): post() #Programming Thread
« Reply #4757 on: July 26, 2013, 10:21:24 pm »

I edited my hash to show an example.
I think that if you bitshift a 32 bit integer and bit-add the other one to it it should work for all ints.
Eg,
uint64_t hash = (uint64_t)n1 << 32 | n2

If the points are fairly evenly spread out, couldn't you just use one of the co-ordinates as the hash value?

You mean use either x or y as the hash?
« Last Edit: July 26, 2013, 10:24:55 pm by Skyrunner »
Logged

bay12 lower boards IRC:irc.darkmyst.org @ #bay12lb
"Oh, they never lie. They dissemble, evade, prevaricate, confoud, confuse, distract, obscure, subtly misrepresent and willfully misunderstand with what often appears to be a positively gleeful relish ... but they never lie" -- Look To Windward

Rose

  • Bay Watcher
  • Resident Elf
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4758 on: July 26, 2013, 10:25:21 pm »

Eeeh... the bitshifting is only guaranteed to work if both of the int32s are actually uint32s.

seriously, use the union, it's safer.
Logged

RulerOfNothing

  • Bay Watcher
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4759 on: July 26, 2013, 10:27:22 pm »

You mean use either x or y as the hash?
Yes.
Logged

Skyrunner

  • Bay Watcher
  • ?!?!
    • View Profile
    • Portfolio
Re: if self.isCoder(): post() #Programming Thread
« Reply #4760 on: July 26, 2013, 10:35:49 pm »

You mean use either x or y as the hash?
Yes.
Well, then, (10,11) and (10,255) would have the same hash. Using either x or y is like projecting a plane onto a line--information is guaranteed to be lost.

Eeeh... the bitshifting is only guaranteed to work if both of the int32s are actually uint32s.

seriously, use the union, it's safer.
I don't see how the union would help things...

Hmm, maybe I can use the flattened vector offsets as the hash. (59,10) on a 256×256 map is vector[59×256+10], for example.
Logged

bay12 lower boards IRC:irc.darkmyst.org @ #bay12lb
"Oh, they never lie. They dissemble, evade, prevaricate, confoud, confuse, distract, obscure, subtly misrepresent and willfully misunderstand with what often appears to be a positively gleeful relish ... but they never lie" -- Look To Windward

Rose

  • Bay Watcher
  • Resident Elf
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4761 on: July 26, 2013, 11:01:28 pm »

Union will help things by solving your problem in its entirety in one step.
Logged

MadocComadrin

  • Bay Watcher
  • A mysterious laboratory goblin!
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4762 on: July 27, 2013, 09:15:55 am »

If your coordinates are all positive then the flattened array value would work.

The union solution is a very nice way of handling it too. It essentially lays out the values in memory like so:

0-31 32-63
  x      y

And allows you to interpret it as either a struct of two integers x and y or and one big long (ie, the number you would get by reading 0-63 as a long).
Logged

Mego

  • Bay Watcher
  • [PREFSTRING:MADNESS]
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4763 on: July 27, 2013, 11:51:10 am »

If your coordinates are all positive then the flattened array value would work.

The union solution is a very nice way of handling it too. It essentially lays out the values in memory like so:

0-31 32-63
  x      y

And allows you to interpret it as either a struct of two integers x and y or and one big long (ie, the number you would get by reading 0-63 as a long).

Code: [Select]
union Coord {
    int x;
    long y;
};

Coord make_coord(int x, int y) {
    Coord c;
    c.y = y + long(x) << sizeof(int);
}

Sleepy brain says that code will work.

MagmaMcFry

  • Bay Watcher
  • [EXISTS]
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4764 on: July 27, 2013, 02:08:49 pm »

Here's some actual working code with a short demo. I scrapped the union idea, because on 32-bit systems, the standard hash<long long> function just drops the highest significant bits, which just so happen to contain one of the coordinates, which means that method would result in a damn lot of hash collisions. What I have here is just as fast in calculation and (if I'm right about how the container uses the hash values) it actually makes the container faster.
« Last Edit: July 27, 2013, 03:57:12 pm by MagmaMcFry »
Logged

MorleyDev

  • Bay Watcher
  • "It is not enough for it to just work."
    • View Profile
    • MorleyDev
Re: if self.isCoder(): post() #Programming Thread
« Reply #4765 on: July 27, 2013, 04:55:58 pm »

You know, a fun misconception I've seen a lot lately is that lists are automatically better than vectors when you want to do a lot of pushes and pops, or insertions at random points.

This often proves wrong because of the wonders of CPU-level caching. Always profile before going away from vector, except when you need iterators that will remain valid so long as the item they point to isn't removed and are therefore forced to use list. It turns out the frequent CPU cache misses are too expensive when you iterate over lists for them to stack up well against vectors in many situations where you'd think lists should be best.

Don't blame the tool if you don't use it right.

I wasn't blaming the tool, I was pointing out mistakes that result in "undefined behaviour" are the riskiest part of C++ programming, and where at best a lot of cross-platform compatibility is halted, and at worst a lot of difficult-to-track-down crashes come from.

Whilst often the same behaviour in a JIT'd language will cause a crash, but they often do things to get around that such as being explicitly specified to throw an exception with an occasionally helpful error message. You simply have to be aware how C++, not being JIT'd and being a lot more leaned towards power without restraints, doesn't give you that luxury.

For example, an iterator to a vector can in theory be invalidated as soon as that vector is modified (push or pop). It'll bite you in the arse eventually, so just never store iterators to a vector for a period in which a vector is mutable. If you do, then C#/Java would throw a somewhat readable exception here, whilst a program written in C++ will experience a seemingly random crash at best.

This kind of thing is why C++ fills the niche it does, and they are both the languages greatest strength and greatest weakness.
« Last Edit: July 27, 2013, 05:30:22 pm by MorleyDev »
Logged

Mego

  • Bay Watcher
  • [PREFSTRING:MADNESS]
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4766 on: July 27, 2013, 05:44:51 pm »

Don't blame the tool if you don't use it right.

I wasn't blaming the tool, I was pointing out mistakes that result in "undefined behaviour" are the riskiest part of C++ programming, and where at best a lot of cross-platform compatibility is halted, and at worst a lot of difficult-to-track-down crashes come from.

Whilst often the same behaviour in a JIT'd language will cause a crash, but they often do things to get around that such as being explicitly specified to throw an exception with an occasionally helpful error message. You simply have to be aware how C++, not being JIT'd and being a lot more leaned towards power without restraints, doesn't give you that luxury.

For example, an iterator to a vector can in theory be invalidated as soon as that vector is modified (push or pop). It'll bite you in the arse eventually, so just never store iterators to a vector for a period in which a vector is mutable. If you do, then C#/Java would throw a somewhat readable exception here, whilst a program written in C++ will experience a seemingly random crash at best.

This kind of thing is why C++ fills the niche it does, and they are both the languages greatest strength and greatest weakness.

I realize now that my comment was a bit ambiguous; it was meant towards dreadmullet, because he was bitching that the language operated exactly how it should, citing your post as further reasoning in addition to the link I provided.

Here's some actual working code with a short demo. I scrapped the union idea, because on 32-bit systems, the standard hash<long long> function just drops the highest significant bits, which just so happen to contain one of the coordinates, which means that method would result in a damn lot of hash collisions. What I have here is just as fast in calculation and (if I'm right about how the container uses the hash values) it actually makes the container faster.

Why... Why are you using 2654435769? This is bad for 2 reasons:

1. 2^31.305758086099136 is not 2^31-1. That would be 2147483647.
2. Hardcoding the value is bad. The C++ specification only states that each integral and floating-point data type must be at least as large as the next-smallest, with char always being 1 byte. Theoretically, sizeof(int) could be one. Instead, use std::numeric_limits<int>::max() instead, from the limits header.
« Last Edit: July 27, 2013, 06:01:47 pm by Mego »
Logged

MagmaMcFry

  • Bay Watcher
  • [EXISTS]
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4767 on: July 27, 2013, 06:13:44 pm »

Here's some actual working code with a short demo. I scrapped the union idea, because on 32-bit systems, the standard hash<long long> function just drops the highest significant bits, which just so happen to contain one of the coordinates, which means that method would result in a damn lot of hash collisions. What I have here is just as fast in calculation and (if I'm right about how the container uses the hash values) it actually makes the container faster.

Why... Why are you using 2654435769? This is bad for 2 reasons:

1. 2^31.305758086099136 is not 2^31-11. That would be 2147483647.
2. Hardcoding the value is bad. The C++ specification only states that each integral and floating-point data type must be at least as large as the next-smallest, with char always being 1 byte. Theoretically, sizeof(int) could be one. Instead, use std::numeric_limits<int>::max() instead, from the limits header.
1) Yes, and what is special about 2^31-11? Why would I want 2^31-11?
2) I am basing this on the reasonable assumption that Skyrunner is compiling her program on a machine where sizeof(int)==32. If you don't want it hardcoded, just use the nearest odd integer to 256sizeof(int)/goldenRatio.

The reason I chose that number is because I'm reasonably sure that the unsorted_map implementation is storing its elements in buckets determined by the last few bytes of the elements' hash values. The number there is there to ensure that two nearby coordinates have minimal chance to get sorted into the same bucket (which would reduce hash access speed), regardless of the amount of buckets there is.
Logged

Mego

  • Bay Watcher
  • [PREFSTRING:MADNESS]
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4768 on: July 27, 2013, 06:25:53 pm »

Here's some actual working code with a short demo. I scrapped the union idea, because on 32-bit systems, the standard hash<long long> function just drops the highest significant bits, which just so happen to contain one of the coordinates, which means that method would result in a damn lot of hash collisions. What I have here is just as fast in calculation and (if I'm right about how the container uses the hash values) it actually makes the container faster.

Why... Why are you using 2654435769? This is bad for 2 reasons:

1. 2^31.305758086099136 is not 2^31-11. That would be 2147483647.
2. Hardcoding the value is bad. The C++ specification only states that each integral and floating-point data type must be at least as large as the next-smallest, with char always being 1 byte. Theoretically, sizeof(int) could be one. Instead, use std::numeric_limits<int>::max() instead, from the limits header.
1) Yes, and what is special about 2^31-11? Why would I want 2^31-11?
2) I am basing this on the reasonable assumption that Skyrunner is compiling her program on a machine where sizeof(int)==32. If you don't want it hardcoded, just use the nearest odd integer to 256sizeof(int)/goldenRatio.

The reason I chose that number is because I'm reasonably sure that the unsorted_map implementation is storing its elements in buckets determined by the last few bytes of the elements' hash values. The number there is there to ensure that two nearby coordinates have minimal chance to get sorted into the same bucket (which would reduce hash access speed), regardless of the amount of buckets there is.

1) That was a typo that I corrected; it should've been 2^31-1, which is the value of INT_MAX on most machines.
2) If Skyrunner has a machine where sizeof(int)==32, I want it. You're thinking of sizeof(int)==4. But, more to the point, no matter how reasonable an assumption is, it can still make an ass out of you and me. There is a very easy way programmatically to determine INT_MAX on a computer. It's called INT_MAX (or the nicer std::numeric_limits<int>::max()). Making the code more system-independent costs absolutely nothing.

Also, it's pretty easy to make a hash collision with that code. Granted, the two points are nowhere near each other.

MagmaMcFry

  • Bay Watcher
  • [EXISTS]
    • View Profile
Re: if self.isCoder(): post() #Programming Thread
« Reply #4769 on: July 27, 2013, 07:05:59 pm »

2) If Skyrunner has a machine where sizeof(int)==32, I want it. You're thinking of sizeof(int)==4.
Code: [Select]
    c.y = y + long(x) << sizeof(int);
Goddamnit, you're infectious. :P

Quote
Also, it's pretty easy to make a hash collision with that code. Granted, the two points are nowhere near each other.
[warning]Do not use this hash for cryptographic purposes.[/warning]
Logged
Pages: 1 ... 316 317 [318] 319 320 ... 796