Topic: Scheduling Algorithm (Read 1033 times)

timmeh · « **on:** March 01, 2011, 08:45:36 pm »

Hey, algorithm challenge for anyone who's up to it xP I work in the Provost Office/Academic Affairs at the college I am currently attending, and have been put in charge of developing some way to better manage the scheduling of finals for future semesters. It's been done by hand up until this point, but now they want either a standard to work off of to make doing it by hand quicker, or if possible, an automated solution. I threw together a sort of standard, but the problem is, the courses shift enough between semesters that there really is no way to build a definite standard.

The finals have to be scheduled such that it is impossible for any student to have two at the same time, and as such, two classes can only share a final time-slot if their meeting times over-lap (preventing the student from being in both in the same semester). This is done by taking a list of standardized class times, assigning them final time-slots (thus handling the majority of the courses), then taking all the courses that over-lap with them and sorting them into the groups. Since all the classes in such a group must overlap not only with the standard time but with all the other times as well there are often conflicts.

Suffice to say, I have solved the problem through the resolution of conflicts. However, after the resolutions there are always going to be a number of classes which cannot be sorted into these groups. These "oddball" classes must then be sorted into groups of overlapping classes as well. The catch is this: it has to be done with as few groups as possible. So the problem is as follows:

Given a list of ranges, sort the ranges into groups in which all the ranges overlap, with as few resulting groups as possible.

I have come up with a couple solutions, but neither is quite as efficient as I would like. The first is to simply start with the first range, stick it in a group with as many courses as possible by adding the first course that overlaps with it, then the next that overlaps with both, and so on. The problem with this is that it must check every other range for each range, which isn't a big deal for the test-data I'm running in on, but were it to be run on the production data, there are often hundreds of unique time-ranges. The other thought was to check different times, and see which time had the most courses at once, create that group, then repeat. But this would prove even more inefficient, as I would have to run checks at 10-15 minute intervals for a 24-hour time for each course.

I've looked through several books on set-algorithms from the school library with no luck, so any advice or pointers would be most appreciated.
Thank you for your time,
Timothy

Ari Rahikkala · « **Reply #1 on:** March 12, 2011, 04:19:26 pm »

Offhand it seems reducible to a variation on the clique problem: With ranges as vertices, two ranges connected if they overlap, find the smallest set of cliques that's a decomposition of the graph. Thing is, that's not exactly useful to know since the clique problem is NP-complete and a damn hard one as NP-complete ones go, even. But then again this wouldn't be the first time someone accidentally reduces an easy problem to a hard one.

At least the graph theory connection might help you establish bounds for where a solution could be found. I don't know if you get the smallest set of cliques by recursively finding and removing the maximum clique of a graph, but if you do, then you'd at least know that while recursively taking the time with the most overlaps and forming groups based on it might be computationally infeasible, it does at least produce the right answer. But I'll let you do your graph theory reading-up for yourself

Normandy · « **Reply #2 on:** March 13, 2011, 12:39:18 am »

If I understand your problem correctly, you don't need to check each range against each other range. For each class n with time range T_n in the group, you want to see if the intersection T₁ ∩ T₂ ∩ T₃ ∩ ... of the group is non-empty (i.e. all of the classes overlap). Since intersections are associative, you can just keep a running intersection for each group. If each time range for a class overlaps with the running intersection, then it is guaranteed to intersect with all of the other time ranges in that group.

This might not solve all of your problems, however, as it doesn't remove the oddball classes, it just makes it easier to check if time ranges overlap.

timmeh · « **Reply #3 on:** March 13, 2011, 05:32:23 pm »

Thank you both very much for the input. I've got it working with one of the brute force methods at the moment, but will am scheduled to review the program with my boss tomorrow, and should get to look into optimizing it soon!

Eagleon · « **Reply #4 on:** March 15, 2011, 06:11:44 pm »

In solutions like this, you don't really need a huge amount of optimization so long as you're not doing something crazy unnecessary with the data. It's a one-run application, you're not looking at a real-time situation at all. Assuming you haven't already finished the thing, you should work on making it as useful as possible. Even if it does chug through a full operation in an hour, that's an hour well spent if it makes the university run more smoothly. After all, you're not paying the computer for the time.

alway · « **Reply #5 on:** March 15, 2011, 07:12:18 pm »

If you wanted to get really clever with the code, it might also compare grade level/departments courses are in. It would be better to schedule English 101 and English 401 at the same time than English 101 and Mathematics 101. Although that would also require additional input about the courses themselves which may or may not be easily accessible in your program.

timmeh · « **Reply #6 on:** March 17, 2011, 07:44:14 pm »

@Eagleon - This is a really good point, and something my boss pointed out to me as well. My current method works, but in testing I managed to break it (I mean, it always succeeds, it's just not always the ideal result), so I'm thinking I'm going to need to improve it further. The one catch here, is that the goal is to translate it to PHP and run it on the school's web-server. I have yet to talk with UITS (the technical division) about what that means as far as processing time and power, but it shouldn't be too much of an issue if, as you and my boss pointed out, it is only supposed to be run once per semester.

@Alway - The information will be pulled directly from the school database via an SQL query, which can be customized and expanded in the future. It's definitely something worth considering though, I'll look into the sort of changes that would have to be made to schedule the finals by section/course rather than by meeting time specifically...

Eagleon · « **Reply #7 on:** March 19, 2011, 03:29:14 am »

Be careful you don't assume too much about course loads. I knew at least a few people in my college that were taking multiple levels of English and History courses to get through prerequisites more quickly, after clearing it with professors. Your safest bet for these kinds of improvements I think would be to go by required courses, and if approval is allowed for bypassing prerequisites, keep potential conflicts a step away from each other - say, for instance, that Statics I requires Calculus 301, and Calc301 requires 201. Avoid conflicts between Calc301 and 201, but not Statics I and Calc201. This is assuming your college allows a lot of these shenanigans, which mine did - it was just on you to keep up. Of course, you could also check out if there's going to be conflicts by individual student schedules if you have access to that data, but that would be spooky and potentially controversial if it's not public information, hehe.

If you wanted to get really crafty with this stuff, you could go into graph theory, treating conflicts as edges between groups of classes, splitting the groups off into new groups/nodes as conflicts are found within themselves, joining similar nodes as time-slots fill up. That's just crazy though

timmeh · « **Reply #8 on:** March 20, 2011, 08:15:15 am »

Actually, I had considered a graph design, but my knowledge of data-structures besides lists and arrays, as well as and math above college algebra is rather lacking. The course load thing is a good point, I'm not entirely sure they enforce it at all. I didn't have credit for college algebra before taking (and getting credit for) Comp-sci 1, for which it is a prerequisite xP

As far as tailoring it to students... for one, it would be quite controversial, as not only is it not public information but I'm not suppose to have access to it, so it would require all sorts of... back-door-shenanigans on my part

For another though, it would shift from feeding information about several hundred meeting times, to information about up to 9,000-ish individual schedules

And while speed may not be the highest priority, I think they'd appreciate it if I didn't require that they divert all the computer power available to the school to generating the schedule for a week

Thanks for the input!

Rysith · « **Reply #9 on:** March 20, 2011, 12:33:21 pm »

Quote from: Ari Rahikkala on March 12, 2011, 04:19:26 pm

Offhand it seems reducible to a variation on the clique problem: With ranges as vertices, two ranges connected if they overlap, find the smallest set of cliques that's a decomposition of the graph. Thing is, that's not exactly useful to know since the clique problem is NP-complete and a damn hard one as NP-complete ones go, even. But then again this wouldn't be the first time someone accidentally reduces an easy problem to a hard one.

The fun thing about NP-complete problems is that they are perfectly reasonable to solve for reasonable N, and assuming that the college doesn't offer more than a few hundred courses it should be completely reasonable to solve that way, especially if you apply reasonable pruning to the solution set (rather than brute force) and the professors aren't scheduling classes to make an adversarial graph[1]. NP just means that your computation-per-N will go up quickly as N increases, and modern computers have quite a lot of computation that they can throw at a problem.

[1] This is something that people often forget about NP problems: Even if the always-correct algorithm is NP there are heuristic-based algorithms that will get within 0.5% of the optimal solution that are in P. This is also why we program for real computers rather than turing machines directly[2], even if they are computationally equivalent.

[2] Unless you are programming in brainfuck.

Bay 12 Games Forum

News:

Author Topic: Scheduling Algorithm (Read 1033 times)

timmeh

Scheduling Algorithm

Ari Rahikkala

Re: Scheduling Algorithm

Normandy

Re: Scheduling Algorithm

timmeh

Re: Scheduling Algorithm

Eagleon

Re: Scheduling Algorithm

alway

Re: Scheduling Algorithm

timmeh

Re: Scheduling Algorithm

Eagleon

Re: Scheduling Algorithm

timmeh

Re: Scheduling Algorithm

Rysith

Re: Scheduling Algorithm