[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PHPwestoz] [OT] MySQL UNIQUE INDEX



Mark Hendley wrote:

Hmm. Having thought about your issue a bit, I find I have a reservation
:)

If the FID column is used, as you indicate, to store a 'path' then it is
likely that the left-hand side (the first X characters) will contain a
lot of duplication. Because you can only create and index of the 1st X
(varies based on MySQL version) characters, this makes this approach
problematic.

You'd be better of having a column that is an MD5 hash of the full path
names. This will take a max of 32 characters CHAR(32) AND will be more
efficient to boot.

To do a search with a path aspect involved, just MD5 the source path you
are looking for and search for that MD5 string. Indexing should be
quicker as well. The database will also be smaller and, with the absence
of the TEXT column, quicker generally.

FYI - you'd have about a 1 in 100 million chance of a KEY collision
using this method. If critical, you could always do a search for an
existing MD5 KEY before you insert a new one (ie: Search for it) - this
will, at least, allow you a warning that a key collision may have
occurred.

I was already doing a sha1 hash on the file it's self as I need to be
able to keep track of files even after they move. I guess I should use a
CHAR(40) for this. Thanks

Tim

--
Tim White - Use the Fox, Luke!
PGP/GPG id: 602E944D, Pub Key Serv: subkeys.pgp.net
Fingerprint: 04C2 9682 B7B2 3006 009D  A9F3 067E EDCD 602E 944D
Hi! I'm a .signature virus! Copy me into your ~/.signature to help me spread!
--

Attachment: signature.asc
Description: OpenPGP digital signature