- Time to Go Biodegradeable?: Sydney, Australia--Oceanographic scientists say they have discovered a vast, floating "reef" of the world's disposed condoms in the middle of the South Pacific, about halfway between Tahiti and Antarctica. The phenomenal mass is almost two miles long, an (weird)
- Andy Budd::Blogography: 10 Bad Project Warning Signs: "One of the great things about being a freelance web designer is the ability to turn down projects. I’ve come across a few projects recently that sounded interesting but made me feel nervous." (design web)
- Common Errors in English: Also available in Dead Tree format (funny howto reference tool writing)
Tuesday, May 31, 2005
del.icio.us links for 2005-05-31
Monday, May 30, 2005
del.icio.us links for 2005-05-30
- AMAZING X-RAY EFFECT: Clever CSS trick (css image mozilla xhtml)
- The Cyborg Name Generator: Filthy humans: Determine your true cyborg name with this automated process (humor)
- XHTML2: Accessible, Usable, Device Independent and Semantic: XHTML2 is the next version of the XHTML family, and is going to last call Real Soon Now. This presentation gives an overview of what XHTML2 is trying to achieve. (html w3c web xhtml xml)
Tuesday, May 24, 2005
del.icio.us links for 2005-05-24
- New Attack Can Recover Complete AES Keys: Daniel Bernstein, an associate professor at the University of Illinois at Chicago, recently released a paper showing how an attack against a server running the OpenSSL AES implementation could recover the entire encryption key. (cryptography djb privacy security)
Monday, May 23, 2005
del.icio.us links for 2005-05-23
- How to Make a RJ45 Cable Tester: Includes general information on RJ45 cable standards for Ethernet (ethernet howto network rj45)
Thursday, May 19, 2005
DirecWay: Broadband provider of last resort
Thursday, May 12, 2005
del.icio.us links for 2005-05-12
- dm-crypt - a device-mapper crypto target: a device-mapper target that provides transparent encryption of block devices using the new Linux 2.6 cryptoapi. (cryptography linux security)
Wednesday, May 11, 2005
del.icio.us links for 2005-05-11
- Flickr: Photos from [c21de] Vintage Images: These chicks were hot... 50 years ago. Vintage girly-mag cover gallery [via BoingBoing] (art image photo)
Tuesday, May 10, 2005
del.icio.us links for 2005-05-10
- Dive Into Greasemonkey: Greasemonkey is a Firefox extension that allows you to write scripts that alter the web pages you visit. (extension firefox greasemonkey javascript)
Monday, May 09, 2005
del.icio.us links for 2005-05-09
- How to Perform Strong Man Stunts by Ottley R. Coulter: All the World Loves a Strong Man (howto)
Saturday, May 07, 2005
Friday, May 06, 2005
del.icio.us links for 2005-05-06
- FAQ: How Real ID will affect you | CNET News.com: Starting three years from now, if you live or work in the United States, you'll need a federally approved ID card to travel on an airplane, open a bank account, collect Social Security payments, or take advantage of nearly any government service. (liberty politics privacy)
Wednesday, May 04, 2005
del.icio.us links for 2005-05-04
- ITworld.com - Naked programming on naked street: Not as exciting as it sounds: Static and dynamic typing compared traffic controls and the lack thereof. (programming python)
Saturday, April 30, 2005
Friday, April 29, 2005
del.icio.us links for 2005-04-29
- How to Create a Frames Layout with CSS - WebReference.com: The following article will detail how to set up a 'frame' style layout with a fixed header, which can incorporate the navigation, a fixed footer and a scrolling content area, all of which will resize down to virtually nothing and still be usable (with scr (css html layout web)
Thursday, April 28, 2005
del.icio.us links for 2005-04-28
- Aardvark Firefox Extension: Allows you to resize or remove elements on a page loaded in the browser (extension firefox html tool)
- The Ghost That Feeds: Ray Parker Jr's "Ghostbusters" vs. Nine Inch Nails "The Hand That Feeds" (funny mashup music)
- Rate my Boob Job: Vote on Breast Implant pictures (boobs nsfw photo)
Wednesday, April 27, 2005
del.icio.us links for 2005-04-27
- Juicy Studio: Readability Test: Calculate readability of a website or document by several algorithms: Gunning-Fog, Flesch Reading Ease, Flesch-Kincaid (tool writing)
Tuesday, April 26, 2005
del.icio.us links for 2005-04-26
- Free Mag 7 Star Charts: free, downloadable set of high-quality star charts capable of being printed at reasonable resolutions on the average home printer. (astronomy free maps pdf)
Saturday, April 23, 2005
MySQL UC: Day 4
It's day 4 at the MySQL Users Conference. It's also the day I present.
Three Approaches to MySQL Applications
I went to this for one reason only: It was the presentation immediately preceding mine, so I was guaranteed to be on time and not have any surprises about the room.
The presenters were from Dell, and their demo app was a DVD sale system using MySQL running on a (surprise!) Dell server (dual Xeons at 3.06 GHz). The three approaches were PHP, JSP, and ASP.net. I was really, really glad to see this as the presentation before mine, because the PHP database code was such a mess. In PHP, you have to piece together your own literal query, typically via string concatenation and $ substitution, and then call functions that are different for each database. If you want to even try to be database-independent, you need to add an abstraction layer of some sort. I've done this before with PHP, implementing something like Python's DB-API, and having subclasses for both MySQL and PostgreSQL, but this was not done here. Worse, I am pretty sure that many of their PHP code examples were susceptible to SQL injection attacks. In short, it makes Python look damn good, which it is.
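To make the SQL injection point concrete, here is a minimal sketch of the difference between gluing a query together and letting a DB-API driver bind the parameters. It is not the code from either presentation. It uses the standard-library sqlite3 module as a stand-in so it runs anywhere (MySQLdb follows the same DB-API but uses %s placeholders instead of ?), and the dvd table and both function names are made up for illustration.

```python
import sqlite3

# sqlite3 stands in for MySQLdb here; both follow the DB-API.
# The table and its contents are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE dvd (id INTEGER PRIMARY KEY, title TEXT)")
cur.executemany("INSERT INTO dvd (title) VALUES (?)",
                [("Brazil",), ("Alien",)])


def find_unsafe(cur, title):
    # String concatenation: the input becomes part of the SQL text.
    cur.execute("SELECT title FROM dvd WHERE title = '" + title + "'")
    return [row[0] for row in cur.fetchall()]


def find_safe(cur, title):
    # Parameter binding: the value is passed separately from the SQL.
    cur.execute("SELECT title FROM dvd WHERE title = ?", (title,))
    return [row[0] for row in cur.fetchall()]


hostile = "x' OR '1'='1"
unsafe_rows = find_unsafe(cur, hostile)  # the OR clause matches every row
safe_rows = find_safe(cur, hostile)      # no DVD has that literal title
```

With the concatenated query, the hostile string rewrites the WHERE clause and every row comes back; with the bound parameter, it is just an odd title that matches nothing.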
Python and MySQL
My presentation is now available on-line. I was happy with the way it turned out. We had 45 minutes, with the last five minutes for questions. I had 37 slides, and I managed to finish with seven minutes for questions, and ended up getting 10-15 minutes of questions. It was not a huge crowd -- less than 30 people I think -- but it was also the last day of the conference, with checkout time at noon, and I don't think I had anyone leave before the question session. People I talked with afterwards seemed to think it was useful, at least, including people that were already MySQLdb users, and the presentation was intended primarily for non-users.
Lunch
Make your own sandwich day.
Writing Storage Engines for MySQL
This was a more focused version of the "Tour of the Source" presentation, with more details on creating new column types and storage engines.
There are quite a few MySQL storage engines out there; in fact, a lot more than you might think:
- ISAM. This is the original MySQL storage engine, before there were storage engines, and in fact it is now removed from 5.0.
- MyISAM. This is the replacement for ISAM, around since 3.23. It's not transactional -- though it is planned to eventually make it transactional -- but very fast if you have to append new rows.
- MERGE. This is, in a way, a limited sort of view. You can define a MERGE table to be the UNION of several identical (schema) MyISAM tables. For example, if you are recording log data or audit trails, you could segment your data into separate tables for each month, and then use a MERGE table to treat them like one.
- BDB. This is the Sleepycat Berkeley DB. BDB itself is not a relational database but a hashed-key database which is transactional.
- InnoDB. InnoDB has been around for a long time, supposedly more than 10 years. It provides Oracle-style multiversioned concurrency control.
- Archive. This is an engine that was developed by Yahoo! for logging. It supports SELECT and INSERT, and compresses records as they are inserted.
- CSV. This does what you would think: Each table is stored in a CSV (comma-separated value) file, which is suitable for importing into lots of programs, especially spreadsheets.
- FEDERATED. This engine actually acts as a proxy to another database server. At the moment, the remote server must be a MySQL server, but other servers are expected to be supported in the future. Transactions are not supported, but SELECT, UPDATE, DELETE, and INSERT are. In theory, it seems like transactions could be supported, and the documentation seems to hint that it could be at some point. Referential integrity can't be guaranteed, since the tables could change on the remote server without the local server being informed about it.
- MEMORY. Also known as HEAP tables, these only exist in RAM and are never written to disk. They are used internally sometimes for JOINs and views, but can be used for other temporary uses. The schema is persistent; only the data is volatile.
- BLACKHOLE. All SQL operations are supported, but they do nothing. Data written to BLACKHOLE tables is discarded; SELECT returns no rows. These tables are supposed to be good for replicating the schema but not the data.
- EXAMPLE. This doesn't do anything useful, except provide a skeleton example of how to make your own storage engine.
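As a rough illustration of the MERGE engine's monthly-log use case from the list above, the following sketch builds the CREATE TABLE statement for a MERGE table over a set of identical MyISAM tables. The helper name, table names, and columns are hypothetical, and the sketch assumes the names come from trusted code, since it does no identifier quoting.

```python
def merge_table_ddl(name, columns, members):
    """Build the DDL for a MERGE table over identical MyISAM tables.

    Hypothetical helper: names are assumed trusted and are not quoted.
    """
    if not members:
        raise ValueError("a MERGE table needs at least one member table")
    cols = ", ".join(columns)
    union = ", ".join(members)
    # INSERT_METHOD=LAST sends new rows to the last table in the UNION list.
    return ("CREATE TABLE %s (%s) ENGINE=MERGE UNION=(%s) INSERT_METHOD=LAST"
            % (name, cols, union))


months = ["log_2005_03", "log_2005_04"]
ddl = merge_table_ddl("log_all", ["ts DATETIME", "msg VARCHAR(255)"], months)
```

Each month you would create the new MyISAM table and then redefine the MERGE table with the longer UNION list.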
Post-Conference
The conference was wrapped up by 5 p.m. PDT. Since I had some time on my hands, I checked the train schedules, and decided to make a quick trip to San Francisco. This involved taking the VTA train/trolley to Mountain View and catching Caltrain to San Francisco: $6 for both trains. Once there, I got on the 30 bus and went to Chinatown. Had dinner at Cafe Honolulu. Walked up-hill a block and rode the cable car down to Fisherman's Wharf. By this time, it was dark, but you still get a good view of the city. From Fisherman's Wharf, you can see the Golden Gate bridge.
I had figured out from the train schedules that the last Caltrain train I could take and still make the last VTA train out of Mountain View left at 10:07 pm PDT. My other options were: a) take the 12:07 a.m. PDT train to Mountain View and catch a taxi back; and b) take a train that left after 4 a.m. and wander around SF for six hours. In the end, I decided to take the cable car to the other end of the line. This put me close to 4th Street, which was one of the cross streets at the Caltrain station. I missed one bus and decided to walk there, which turned out to be a pretty good idea, since I got there with about 10 minutes to spare and was not passed by any more busses. I thought it would be about six blocks, but it was more like 1.3 miles. Then took Caltrain to Mountain View, and waited on the train about half an hour before it left. By this point I was nearly falling asleep, so skipping town early was probably the right choice. I figured out that to catch my 1:03 p.m. flight the next day, I should take the VTA train leaving around 10:30 a.m., and that would take me back to the San Jose airport. Got back to my hotel after midnight; set the alarm for 8:30 a.m. and went to sleep.
8:30 a.m. arrives. Hit snooze.
8:37 a.m. or so arrives. Turned off alarm.
10:00 a.m. arrives. Now I am wide-awake. Pack in a frenzy. Checked out about 10:25 a.m. Made it to train station in time. Had to change trains and then catch the airport flyer bus to the airport, which turns out to be only about another mile away. Checked in. Got breakfast burrito at Señor Jalopeño. Ate a pastry I had gotten from Cafe Honolulu the night before in SF and put in my coat pocket, just in case. Flight to Atlanta and airport shuttle to Athens were uneventful, in comparison.
Wednesday, April 20, 2005
MySQL UC: Day 3
It's day 3 at the MySQL Users Conference.
Hotel Hopping
I checked out of the Four Points Sheraton in Sunnyvale and into the Santa Clara Westin (the conference hotel). I had planned to stay at the Westin the entire trip, but they were booked up the first few days of the conference. I definitely wanted to be there the night before my own presentation. Travelling back and forth from the Sheraton cost me at least an hour every day.
Enterprise MySQL: Views in MySQL 5.0
This was similar to yesterday's Flagship Features in MySQL 5.0, but focused exclusively on views.
In general (i.e. not just MySQL, but relational databases), not all views are updatable. Whether or not a view is updatable seems to depend mostly upon any JOIN condition that may be present in the underlying query. Aggregate functions and derived columns will also make a view (or at least the relevant columns) non-updatable. MySQL-5.0 does not support all theoretically-possible updatable views, but it supports a lot. The ones it does not support are left out mainly because of implementation or performance issues. Some of the exceptions are listed in the CREATE VIEW manual section.
When they say a view is updatable, they mean you can use UPDATE, INSERT, and DELETE.
Some new VIEW-related privileges were added. In particular, you can restrict examining the view's schema, while still allowing access. For access, you need SELECT, UPDATE, etc. on the view itself and not the underlying tables.
You can enforce constraints on views locally, or have them cascade if you have a view that is based on other views.
MySQL Security
This was an overview of best practices, and what you should do to protect your server. The Windows-only worm that was active for one day a few months back was taking advantage of three security issues. The first was that the Windows installation (unlike the default UNIX installation) was adding access for root@%. The second: A lot of admins were choosing crappy passwords. The initial worm had a small dictionary of passwords to try, like "abc123", and would simply try everything. One user there said the list got longer as time went on, which seems entirely possible, since it would connect to IRC servers and websites. The final problem: A lot of admins were using a system account to run the server, which effectively gave it root.
The worm would log in as root, and create a table with a BLOB column. Then it would INSERT a record containing the worm payload (a DLL), and then use SELECT INTO DUMPFILE to write it to the disk. Finally it would use CREATE FUNCTION to load the DLL. To mitigate the problem, mysqld now will not load a shared library for a user-defined function (UDF) unless it implements the required API calls. This is not much protection, since the attacker can add these to the payload, but it would thwart a naïve worm implementation. They also removed the root@% account, or at least no longer install it, and seem to more strongly encourage running mysqld with an unprivileged account.
At least one person voiced support for adding group/role support to the privilege system, so that you could configure privileges for a few roles and then assign multiple users to that role, or to multiple roles. The presenter was fully in agreement, saying that this was something he also wanted badly. Afterward, we had a brief discussion of how something like this might be implemented now, before there is official support. I had two separate ideas. One was to add a column to the user privilege table that would indicate whether or not this was a real user or a role, the difference being that a role account could not log in, and was simply a prototype for other users with that role. Then have a separate table that would associate users with multiple roles. The actual user privileges would be manipulated outside of mysqld by some script. The other idea was use a view for the user table which would somehow consolidate all the privileges for the roles that user was a member of.
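The first idea (roles as prototype accounts, with an outside script doing the bookkeeping) could be sketched roughly like this. Nothing here is an existing MySQL feature: the role names, the two mapping tables (shown as plain dicts), and both functions are hypothetical, and the script only generates GRANT statements rather than touching the privilege tables directly.

```python
# Hypothetical role definitions and user-to-role assignments. In the idea
# described above these would live in tables; dicts keep the sketch short.
ROLE_PRIVS = {
    "reader": {"SELECT"},
    "writer": {"INSERT", "UPDATE", "DELETE"},
}
USER_ROLES = {
    "alice": ["reader", "writer"],
    "bob": ["reader"],
}


def effective_privileges(user, user_roles, role_privs):
    """Union of the privileges of every role the user belongs to."""
    privs = set()
    for role in user_roles.get(user, []):
        privs |= role_privs[role]
    return privs


def grant_statements(user, host, db, user_roles, role_privs):
    """Generate the GRANT a maintenance script would run for one user."""
    privs = sorted(effective_privileges(user, user_roles, role_privs))
    if not privs:
        return []
    return ["GRANT %s ON %s.* TO '%s'@'%s'"
            % (", ".join(privs), db, user, host)]


alice_grants = grant_statements("alice", "localhost", "shop",
                                USER_ROLES, ROLE_PRIVS)
```

A real script would also have to REVOKE privileges when a role shrinks or a user leaves it, which this sketch ignores.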
Lunch
Italian day. Pizza, pasta, etc.
Tour of the MySQL Source Code
This was a broad overview of how the MySQL source code is organized, with some information on adding UDFs and storage engines.
There already exists some UDFs to run perl and PHP, and Brian Aker told me there was one for Python, since he was the one who originally wrote it. However, my Google searches were turning up nothing relevant, so I spent the rest of the afternoon writing one. I've gotten it to work, except that there is a dynamic linking problem which causes imports of extension modules to fail: It can't find symbols that are definitely in there from libpython. I can probably fix this with the linker flags, but I ran out of time to work on it.
One of the quirks/features of the UDF API is, if you want to have a function foo, you need to define foo_init(), foo(), and maybe foo_deinit(). foo_init() is supposed to do all the initialization and argument checking. If there are any problems with the arguments, you have to report it here; foo() cannot return an error code. Checking the arguments is not really practical with Python; it can be done with introspection but would be expensive. If an exception gets raised, the only real option is to return NULL. It does send tracebacks to the log, though. foo_deinit() is supposed to clean up, but is only called if foo_init() was successful. There is a separate set of API functions to define for aggregate functions.
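The real UDF API is C, so the following is only a Python model of the calling convention described above, not the actual interface: init runs once and is the only place an error can be reported, the main function runs once per row and falls back to None (standing in for NULL) when an exception is raised, and deinit runs only if init succeeded. Every name here (call_udf, UdfInitError, the ratio functions) is made up for the sketch, and the "log" is just a list.

```python
import traceback


class UdfInitError(Exception):
    """Raised by an init function to reject the call (hypothetical name)."""


def call_udf(init, func, deinit, arg_count, rows):
    """Model the lifecycle: init once, func once per row, deinit once."""
    try:
        state = init(arg_count)
    except UdfInitError as exc:
        # Init failed: report the error and skip deinit entirely.
        return "ERROR: %s" % exc
    try:
        return [func(state, args) for args in rows]
    finally:
        deinit(state)


def ratio_init(arg_count):
    # The only place where bad arguments can be reported.
    if arg_count != 2:
        raise UdfInitError("ratio() takes exactly 2 arguments")
    return {"log": [], "cleaned_up": False}


def ratio(state, args):
    try:
        return args[0] / args[1]
    except Exception:
        # No error code is available here: log the traceback, return NULL.
        state["log"].append(traceback.format_exc())
        return None


def ratio_deinit(state):
    state["cleaned_up"] = True


results = call_udf(ratio_init, ratio, ratio_deinit, 2, [(6, 3), (1, 0)])
```

The second row divides by zero, so its result is None while the traceback ends up in the model's log.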
Tuesday, April 19, 2005
MySQL UC: Day 2
It's day 2 at the MySQL Users Conference.
LiveJournal's Backend: A History of Scaling
If my backend had a history of scaling, I'd be pretty upset.
I got here about 5 minutes late, and it was standing room only.
In short, they do a lot of caching on the front end. On the database side, they split databases across servers in some cases. They also do some master-slave replication. Another setup is to use InnoDB with the database on a remote filesystem, and if the master goes down, the backup master mounts the filesystem and recovers. They hate the Linux NFS implementation (probably with good reason), and so I suspect they use some type of NAS like iSCSI for this; it wasn't clear to me. They also do a lot of perl.
Other advice: Use InnoDB. Despite some early InnoDB problems, it works quite well for them. The exception to this is logging, in which case the advice is: Use MyISAM. It's not that InnoDB is bad, but that MyISAM is so good at appending rows.
Apparently InnoDB is about 15 years old, according to one of the presenters; it's just the MySQL support that is relatively new.
Flagship Features in MySQL 5.0
MySQL-5.0 (5.0.4 is the second beta, just released) has support for views, stored procedures, triggers, and an information schema. The most interesting thing about views is that they are updatable in most cases. With PostgreSQL, it appears you do not have directly updatable views; you have to write update rules.
Stored procedures follow the SQL 2003 standard. Apparently IBM DB2 is the only other DBMS which satisfies the standard.
The information schema looks like a database, but it is virtual. It's also an SQL standard for getting metadata about your tables. According to Monty, this is not currently implemented as a storage engine, as you might expect, because some of the hooks that would be needed don't exist yet. I think he said this was one of Brian Aker's projects.
Lunch
At least O'Reilly feeds you well at these conferences. Day 1 was Asian-themed: Thai noodle salad, stir-fried vegetables, teriyaki chicken, hot and sour soup, fried rice, spring rolls. Day 2 was Mexican-themed: Field green salad with jalapeno vinaigrette dressing, fajita/taco fixings, black bean soup. OSCON 2002 (the last in San Diego) also had good lunches, as I recall. OSCON 2003 had a brown bag lunch, which was provided by Microsoft. My comment at the time was that Microsoft should get used to us eating their lunch.
PyCon 2005 furnished box lunches that were quite good, though many people had to sit in the hallways to eat them. However, PyCon was only $250 for a three day conference, while the early-bird price for MySQL UC is $895 for the conference without tutorials. PyCon doesn't pay speakers, but then neither does MySQL UC, except for tutorials.
Distributed Transactions With MySQL XA
I had trouble finding the room and got there about 10 minutes late.
MySQL-5.0.4 can act as a transaction manager for performing distributed transactions. What this means is that there is a two-stage commit (prepare and commit), and once one of the servers in the transaction successfully prepares, it guarantees that it can commit or roll back. If the TM crashes, it can recover open transactions and either commit them or roll them back. This can take an indefinite amount of time, or there can be timeouts.
An example of how you could use this is writing to multiple databases -- perhaps multiple MySQL servers or MySQL and Oracle and ZODB, etc. -- and ensuring that all servers involved either commit entirely or roll back entirely.
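The prepare-then-commit idea can be sketched in a few lines. This is a toy in-memory model and not MySQL's XA interface: the Participant class and the two_phase_commit function are hypothetical, there is no crash recovery or timeout handling, and a participant "prepares" simply by flipping a flag.

```python
class Participant:
    """Toy stand-in for one server taking part in the transaction."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare
        self.state = "active"

    def prepare(self):
        # A successful prepare is a promise that commit cannot fail later.
        if self.can_prepare:
            self.state = "prepared"
            return True
        self.state = "rolled back"
        return False

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled back"


def two_phase_commit(participants):
    """Commit everywhere only if every participant prepares."""
    # Phase 1: collect votes; all() stops at the first refusal.
    if all(p.prepare() for p in participants):
        # Phase 2a: everyone promised, so commit everywhere.
        for p in participants:
            p.commit()
        return True
    # Phase 2b: someone refused, so roll back everywhere.
    for p in participants:
        p.rollback()
    return False


good = [Participant("mysql_a"), Participant("mysql_b")]
bad = [Participant("mysql_a"), Participant("other_db", can_prepare=False)]
good_result = two_phase_commit(good)
bad_result = two_phase_commit(bad)
```

In the second run one participant refuses to prepare, so both end up rolled back even though the first one had already prepared.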
MySQL Cluster Features and Roadmap
Clustering first showed up in early 4.1, though it had some serious limitations. Some of these limitations are removed in 4.1.10a, and more are removed in 5.0.4.
In the 4.1 series, if you do a SELECT with a WHERE clause, the MySQL server has to fetch all rows from the NDB nodes and then filter them based on the WHERE condition. In 5.0, the WHERE clause is evaluated by the NDB nodes, which can greatly reduce network traffic between the nodes and the server.
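To see why pushing the WHERE clause down matters, here is a small model that counts how many rows cross the "network" in each case. It is only an illustration of the idea: the data nodes are plain lists, the function names are invented, and nothing here talks to a real NDB cluster.

```python
def make_nodes():
    # Two hypothetical data nodes, each holding part of one table.
    return [
        [{"id": 1, "qty": 5}, {"id": 2, "qty": 50}],
        [{"id": 3, "qty": 7}, {"id": 4, "qty": 80}],
    ]


def select_filter_on_server(nodes, predicate):
    """4.1-style: ship every row to the server, then filter there."""
    shipped = [row for node in nodes for row in node]
    return [row for row in shipped if predicate(row)], len(shipped)


def select_filter_on_nodes(nodes, predicate):
    """5.0-style: each node applies the condition before shipping rows."""
    shipped = [row for node in nodes for row in node if predicate(row)]
    return shipped, len(shipped)


def big_order(row):
    # Stands in for: WHERE qty > 20
    return row["qty"] > 20


rows_server, shipped_server = select_filter_on_server(make_nodes(), big_order)
rows_nodes, shipped_nodes = select_filter_on_nodes(make_nodes(), big_order)
```

Both approaches return the same rows; the difference is that the first one ships all four rows to get two, and the gap grows with the size of the table.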
Support for BLOBs was added in 4.1.10a, but NDB is really not intended for large BLOBs. In 5.0.4 it is still an in-memory database, which is checkpointed to disk periodically.
In 4.1, NDB tables cannot use the query cache. The query cache is a function of the MySQL server, not the NDB nodes. In 5.0.4, the storage engine API has apparently been improved so that the NDB nodes can invalidate the cache in the MySQL server(s).
Replication does work with clustering, but it's not perfect. Since you can have multiple MySQL servers as a front-end to a cluster of NDB nodes, the server being replicated doesn't necessarily see all transactions, so they don't all get replicated. However, if you had a single read-write server and the rest read-only, you could replicate the data to another server which used a different cluster; the example given was for west and east coast clusters, with west replicating to east. This is in addition to the normal replication features of the cluster. If you want a disk analogy, this would be like RAID-110: A mirrored array of two RAID-10 (striped and mirrored) arrays.
Free as in Food and Beer
Post-sessions there was free food, beer, and wine: Bruschetta, stuffed mushrooms, tiny quiches, beef/chicken kebobs, cheese, bread. Decent beer: several varieties including Sam Adams.
Happy happy joy joy
I built a new kernel last night (gentoo-sources-2.6.11-r6) and as a result, when I resume after suspending to disk, my PCMCIA and USB work. This used to work and broke at some point. It's really nice to be able to suspend and resume and still have net and trackball afterwards. In my suspend script, I shut down PCMCIA and USB and then remove the modules, and afterwards restart PCMCIA and USB. I don't know if that's still strictly necessary, but I don't feel like breaking it again just yet.