algorithm - How to code a URL shortener? -
i want create url shortener service can write long url input field , service shortens url "http://www.example.org/abcdef
".
edit: due ongoing interest in topic, i've published efficient solution github, implementations javascript, php, python , java. add solutions if :)
instead of "abcdef
" there can other string 6 characters containing a-z, a-z , 0-9
. makes 56~57 billion possible strings.
my approach:
i have database table 3 columns:
- id, integer, auto-increment
- long, string, long url user entered
- short, string, shortened url (or 6 characters)
i insert long url table. select auto-increment value "id
" , build hash of it. hash should inserted "short
". sort of hash should build? hash algorithms md5 create long strings. don't use these algorithms, think. self-built algorithm work, too.
my idea:
for "http://www.google.de/
" auto-increment id 239472
. following steps:
short = ''; if divisible 2, add "a"+the result short if divisible 3, add "b"+the result short ... until have divisors a-z , a-z.
that repeated until number isn't divisible more. think approach? have better idea?
i continue "convert number string" approach. realize proposed algorithm fails if id prime , greater 52.
theoretical background
you need bijective function f. necessary can find inverse function g('abc') = 123 f(123) = 'abc' function. means:
- there must no x1, x2 (with x1 ≠ x2) make f(x1) = f(x2),
- and every y must able find x f(x) = y.
how convert id shortened url
- think of alphabet want use. in case that's
[a-za-z0-9]
. contains 62 letters. take auto-generated, unique numerical key (the auto-incremented
id
of mysql table example).for example use 12510 (125 base of 10).
now have convert 12510 x62 (base 62).
12510 = 2×621 + 1×620 =
[2,1]
this requires use of integer division , modulo. pseudo-code example:
digits = [] while num > 0 remainder = modulo(num, 62) digits.push(remainder) num = divide(num, 62) digits = digits.reverse
now map indices 2 , 1 alphabet. how mapping (with array example) like:
0 → 1 → b ... 25 → z ... 52 → 0 61 → 9
with 2 → c , 1 → b receive cb62 shortened url.
http://shor.ty/cb
how resolve shortened url initial id
the reverse easier. reverse lookup in alphabet.
e9a62 resolved "4th, 61st, , 0th letter in alphabet".
e9a62 =
[4,61,0]
= 4×622 + 61×621 + 0×620 = 1915810now find database-record
where id = 19158
, redirect.
Comments
Post a Comment