Friday, September 22, 2006


Rolling With Post’Em

A few of our instructors started trying to use Post’Em to put their grades online. When our testing center produces results from a Scantron test, they send back a CSV file that is just perfect for Post’Em. Well, almost perfect. As it happens, our testing center uses its own unique identifier for students, called a plid, which is different from their campus-wide single sign-on ID (what we call their NetID). Post’Em expects to find NetIDs in the first column of the CSV file, and no instructor wants to manually correlate plids with NetIDs. That's the kind of grunt work computers are made for!

This was a fun problem for me. Post’Em has some of the most readable and self-documenting code I've seen anywhere in Sakai. I already have a DataSource bean sitting in Spring's bean factory with a connection to the SIS database, which is what I need to look up a NetID based on a plid. All I had to do was add a new property for that bean to Post’Em's faces-config.xml under the relevant managed-bean. Over in, I added a method to find a NetID for a given plid. The only tricky part was figuring out where in the data stream a plid should transform into a NetID.

Now if we could only convince the testing center to upload the CSV where they would normally email it to the instructor, we would have some killer integration. By the way, finding ways to open channels of data between Sakai and all the little data silos around campus is how we make an indispensable information system.
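For anyone curious what the plid-to-NetID step amounts to, here is a minimal sketch in Python rather than the Java that Post’Em is actually written in. The function name, the `lookup_netid` callable, and the assumption that the first row is a header are all mine for illustration; in the real setup the lookup is a query against the SIS database.

```python
import csv
import io


def plids_to_netids(csv_text, lookup_netid):
    """Return csv_text with the first column of each data row mapped from plid to NetID.

    lookup_netid is a hypothetical callable (plid -> NetID, or None if unknown).
    The first row is assumed to be a header and is passed through unchanged.
    """
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row_number, row in enumerate(reader):
        if row_number == 0 or not row:
            writer.writerow(row)  # header row or blank line: leave as-is
            continue
        netid = lookup_netid(row[0].strip())
        if netid is None:
            # Fail loudly rather than post a grade under the wrong identifier.
            raise KeyError(f"no NetID found for plid {row[0]!r}")
        writer.writerow([netid] + row[1:])
    return out.getvalue()
```

Passing a plain dictionary's `get` method as `lookup_netid` is enough to try it out. Raising on an unknown plid is a design choice of the sketch: a silently skipped row would be much harder for an instructor to notice than an upload that refuses to go through.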

Monday, September 18, 2006


I Heart Unix

I just recently figured out how to use svn propset. At last I can have proper svn:ignore properties for those hundreds of 'target' directories in my source folder. The trouble is, svn doesn't allow you to set a property like svn:ignore 'target' recursively. At last count (ok, it was two minutes ago) that would be 334 directories I would need to update. That sounds insane, right? Unix to the rescue.

    for dir in `find . -name 'project.xml' -execdir pwd \;`; do svn propset svn:ignore 'target' $dir; done

It takes longer to write about it than to do it. :-)


Poor Little Database

You'd think I would learn about sub-selects in MySQL. I just paralyzed our production database for 10 minutes with this query:

    select PROVIDER_ID from SAKAI_REALM where REALM_ID in (select CONCAT('/site/', SITE_ID) from SAKAI_SITE_TOOL where REGISTRATION = 'sakai.assignment');

I didn't realize anything was wrong until my instant messenger started lighting up like a switchboard. I used show processlist and killed the offending query. This is yet more evidence that a software engineer should not be permitted to touch the database. On the other hand, why should I be able to cripple our application with a query that MySQL doesn't happen to like?
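One common way around a slow IN (sub-select) is to rewrite it as a join against a derived table. The sketch below uses Python's built-in sqlite3 module with made-up rows, only to check that the two forms return the same PROVIDER_IDs. It says nothing about how either form performs on MySQL, and SQLite spells the concatenation with `||` where MySQL would use CONCAT.

```python
import sqlite3

# The original shape of the query (SQLite syntax for the concatenation).
SUBSELECT = """
SELECT PROVIDER_ID FROM SAKAI_REALM
WHERE REALM_ID IN (SELECT '/site/' || SITE_ID FROM SAKAI_SITE_TOOL
                   WHERE REGISTRATION = 'sakai.assignment')
"""

# Join rewrite. The DISTINCT inside the derived table keeps a site with two
# assignment tools from producing duplicate realm rows.
JOIN = """
SELECT r.PROVIDER_ID
FROM SAKAI_REALM r
JOIN (SELECT DISTINCT SITE_ID FROM SAKAI_SITE_TOOL
      WHERE REGISTRATION = 'sakai.assignment') t
  ON r.REALM_ID = '/site/' || t.SITE_ID
"""


def compare_queries():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE SAKAI_REALM (REALM_ID TEXT, PROVIDER_ID TEXT);
        CREATE TABLE SAKAI_SITE_TOOL (SITE_ID TEXT, REGISTRATION TEXT);
        -- Sample rows are invented for the demonstration.
        INSERT INTO SAKAI_REALM VALUES ('/site/a', 'rosterA');
        INSERT INTO SAKAI_REALM VALUES ('/site/b', 'rosterB');
        INSERT INTO SAKAI_REALM VALUES ('/site/c', 'rosterC');
        INSERT INTO SAKAI_SITE_TOOL VALUES ('a', 'sakai.assignment');
        INSERT INTO SAKAI_SITE_TOOL VALUES ('a', 'sakai.assignment');
        INSERT INTO SAKAI_SITE_TOOL VALUES ('b', 'sakai.syllabus');
        INSERT INTO SAKAI_SITE_TOOL VALUES ('c', 'sakai.assignment');
    """)
    with_subselect = sorted(conn.execute(SUBSELECT).fetchall())
    with_join = sorted(conn.execute(JOIN).fetchall())
    conn.close()
    return with_subselect, with_join
```

Whether the join form would have spared the production database is something to try on a staging copy first, not something this sketch can promise.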

Wednesday, September 13, 2006


Enterprise id is killing me

Will there be no end to the confusion of the Id vs. the Eid? I need more than two hands to count the number of ways this has tripped me up. In the old days, Sakai only used one id. As of Sakai 2.2.0, there's Sakai's internal id, and the institution's id, or enterprise id (eid). Some parts of Sakai's API want an id, other parts want an eid. Usually anything that interacts with your providers (i.e., your enterprise) should get an eid. This seems perfectly reasonable, but it's not as cut and dried as it seems.

For instance, CourseManagementService.getInstructorCourses(instructorId) seems like it would take an eid, because you're going to get your courses from your student information system, which only knows about the enterprise id. It turns out though, that CourseManagementService expects a Sakai id, and then does an eid lookup inside the getInstructorCourses method before delegating to the CourseManagementProvider to get the info out.

You don't always know right away when you have a problem in this regard, because many users have a Sakai id that is equal to their eid (this turns out to be any user who was in your database at the time you did the conversion to 2.2.0). For these users, all the code that takes any kind of id still works even if you're sending the wrong id. Then gradually your system fills up with new users with a Sakai-generated guid and then things really get “interesting.”

Your brain really starts to hurt when you try to make the right list of participants show up in the Site Info tool, because any participants who came to your course from the CourseManagementProvider are represented by an object of type CourseMember and any participants who were manually added by site maintainers are represented by an object of type Participant. The CourseMember has an Id field and Uniqname field. The Participant has an Eid field, a Uniqname field, and a DisplayId field. So what is Uniqname? Is it an Id? Is it an Eid? Turns out it's an Id, and woe betide you if you stuff an Eid in there. So now you can be either a participant or a course member, and you can either have a guid for an id or an eid for an id, and every combination can produce different results. I'm not even going to touch DisplayId.

This is all just a long-winded way of saying the dreaded Eid hit me twice again today. I have a homegrown tool for staff to use to associate rosters with instructors. It broke because it was passing an eid to the CourseManagementService. Also, I had to ask for a bug I reported, SAK-6222, to be retracted because I thought SiteAction was misusing the CourseManagementService when it really wasn't.
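To see why this class of bug hides for so long, here is a toy model in Python. None of it is Sakai's real API: `UserDirectory`, `get_instructor_courses`, and the roster dictionary are invented stand-ins that only mimic the behavior described above, where a service takes an internal id, translates it to an eid, and then asks the provider.

```python
import uuid


class UserDirectory:
    """Toy stand-in for a user directory mapping internal ids to eids."""

    def __init__(self):
        self._eid_by_id = {}

    def add_legacy_user(self, eid):
        # Users who existed before the conversion: internal id equals eid.
        self._eid_by_id[eid] = eid
        return eid

    def add_new_user(self, eid):
        # Users created afterwards get a generated guid as their internal id.
        user_id = str(uuid.uuid4())
        self._eid_by_id[user_id] = eid
        return user_id

    def get_eid(self, user_id):
        # Raises KeyError when handed something that is not an internal id.
        return self._eid_by_id[user_id]


def get_instructor_courses(directory, roster_by_eid, instructor_id):
    """Expects an internal id, looks up the eid, then consults the 'provider'."""
    eid = directory.get_eid(instructor_id)
    return roster_by_eid.get(eid, [])
```

Handing an eid to `get_instructor_courses` works for a legacy user, because the eid happens to be a valid internal id, and fails only for users added later. That matches the pattern of code that looks fine in testing and then breaks as new accounts accumulate.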


Finally TRACS > Sakai 2.2.0

It was a big update and our servers are very busy, so it took a while to make sure everything was ok, but we're finally tracking very closely to the 2.2.x maintenance branch. We got bit by SAK-6128, vanishing Syllabi. I wonder why a change to the Syllabus DDL shows up in a maintenance branch. Thank goodness for Stephen Marquard.


Snapz Pro X 2.0.3

This only has a little bit to do with Sakai. Ambrosia finally put out a Universal Binary of their screen capture utility for the Mac. I didn't really want to run something like that in Rosetta. Find it here. Update: I've been hoodwinked! It is not in fact a Universal Binary; it's “Intel compatible,” which just means it will run, but only under Rosetta. *sigh*

Monday, September 11, 2006


the 700 club

A couple of posts ago, I pointed out that we wouldn't be able to keep measuring adoption rate by the number of unique visitors in a semester, because once that number is roughly equal to the number of potential visitors, it doesn't work as a gauge anymore. Concurrent sessions is a number that’s going to have more legs. Before the start of this semester, we had never cracked 100. My eyes bugged out at 400, but it continues to slowly rise. We hit 700 today for the first time. The JVMs have become a lot more stable, but they are not immune to crashing. I guess the faulty network hardware was causing many of the crashes. I am in the dark as to what factors remain dangerous to our JVMs.

Friday, September 08, 2006


We Welcome Your Comments

The inimitable Dave Ross pointed out to me that my blog required authentication for comments. I've changed the settings so you can now comment anonymously. I had to ask Dave why he didn't want to identify himself. “No, just too lazy to get a blogger account to post a little note.” Got it. Done and done. Have at it!



With a simple query, you can tell how many unique users started a session with Sakai, and you can constrain it to a particular timeframe.

    select count(1) from (SELECT DISTINCT SESSION_USER from SAKAI_SESSION where SESSION_START > '2006-08-23') as num_users;

We have used this to give us an idea of usage since we started our pilot in August of last year. The numbers are telling: By the end of the calendar year 2005 we had around 1,000 users. Between January and May of this year, we had just shy of 4,000. Since the first day of class just over three weeks ago, 13,580. In my talk in Vancouver about migration, I said “Beware. You might succeed.” Luckily, we only have ~40,000 people in the whole campus community, so we won't be able to quadruple again. Say, I could go for some fresh air and a tall glass of lemonade.

Wednesday, September 06, 2006


Close, But No Cigar

I deployed a big update yesterday that included all the changes in the 2-2-x branch since 2.2.0 along with a few query optimizations of my own. It included fixes for bugs that were bugging us, like the inability to add students to groups and a stack overflow error in the Section Info tool. I had given this code a cursory look on our staging server and it looked good, so I ran with it. Unfortunately, I missed the fact that participants in a multi-roster course did not have access to their course. I got the call this morning while I was shaving. There's nothing like troubleshooting code in a bath towel. I immediately rolled back to yesterday's code. My plan for today was to figure out what had been wrong with the new release, but I spent much of the day fixing two new problems: site setup stopped adding rosters to course sites when they were created, and certain users couldn't update their site's rosters after the fact. The first problem was a result of using an updated version of without also updating the CourseManagementProvider it depends on. The second problem was a result of calling the CourseManagementService with an Id, instead of an Eid. I decided to jira that one. Sadly, I'm no closer to my awesome update than I was this morning. I do have a hunch about what was going wrong though. I think my new method in the CourseManagementProvider may not be right. This time I'll be sure before it goes into production. I keep meaning to create a whole suite of Selenium tests. Perhaps someday soon!

Saturday, September 02, 2006


Chasing a Race Condition

For a long time, we've had an unsolved mystery: every once in a great while, some combination of permissions, participants, and roles will just vanish from somebody's site. I always chalked it up to something I did wrong in our AuthzGroupProvider, but now that our traffic is way up, the problem is worse and we have much better information about what's going on. Jeff started keeping a log of every query to MySQL. When the problem reared its ugly head again, we started scanning the logs for anything that does a delete from any of the SAKAI_REALM tables. It was easy to find: edit) deletes every participant, role and permission for a given site before building them all back up again from scratch. We think the problem is insufficient isolation. That is, one thread gets a dirty read of the SAKAI_REALM tables while the save() operation in another thread is still in the middle of reconstituting the data. This problem is exacerbated in MySQL by the fact that all the delete statements use a sub-select, something that performs very poorly on MySQL. I'm betting the reason the institutions using Oracle don't have a problem is that a) the isolation is right, and b) the queries run really fast. I'll have more on this after we've played around with a few exploratory scenarios.
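While we wait on those exploratory scenarios, here is a small sketch of the general idea that a delete-and-rebuild wrapped in one transaction should never be visible half-done to a reader that only sees committed data. It uses Python's built-in sqlite3 with a made-up `realm_grant` table, not MySQL and not Sakai's schema, so it illustrates the principle and does not reproduce our bug.

```python
import os
import sqlite3
import tempfile

SITE = "/site/demo"  # invented realm id for the demonstration


def count_grants(conn):
    rows = conn.execute(
        "SELECT COUNT(*) FROM realm_grant WHERE realm_id = ?", (SITE,)
    ).fetchall()
    return rows[0][0]


def demo_isolation():
    """Return the grant counts a second connection sees before, during, and after a rebuild."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "realm_demo.db")
        # isolation_level=None: no implicit transactions; BEGIN/COMMIT are issued by hand.
        writer = sqlite3.connect(path, isolation_level=None)
        reader = sqlite3.connect(path, isolation_level=None)
        try:
            writer.execute(
                "CREATE TABLE realm_grant (realm_id TEXT, user_id TEXT, role TEXT)"
            )
            writer.executemany(
                "INSERT INTO realm_grant VALUES (?, ?, ?)",
                [(SITE, "u1", "Instructor"), (SITE, "u2", "Student")],
            )
            before = count_grants(reader)

            # Delete everything and rebuild inside a single transaction.
            writer.execute("BEGIN")
            writer.execute("DELETE FROM realm_grant WHERE realm_id = ?", (SITE,))
            during = count_grants(reader)  # reader queries while the delete is uncommitted
            writer.executemany(
                "INSERT INTO realm_grant VALUES (?, ?, ?)",
                [(SITE, "u1", "Instructor"), (SITE, "u2", "Student"), (SITE, "u3", "Student")],
            )
            writer.execute("COMMIT")
            after = count_grants(reader)
        finally:
            reader.close()
            writer.close()
    return before, during, after
```

If the isolation theory is right, the interesting number is `during`: a reader that can see uncommitted work would get zero there, which is exactly the vanished-participants symptom.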
