algorithm - How to code a URL shortener? -


i want create url shortener service can write long url input field , service shortens url "http://www.example.org/abcdef".

edit: due ongoing interest in topic, i've published efficient solution github, implementations javascript, php, python , java. add solutions if :)

instead of "abcdef" there can other string 6 characters containing a-z, a-z , 0-9. makes 56~57 billion possible strings.

my approach:

i have database table 3 columns:

  1. id, integer, auto-increment
  2. long, string, long url user entered
  3. short, string, shortened url (or 6 characters)

i insert long url table. select auto-increment value "id" , build hash of it. hash should inserted "short". sort of hash should build? hash algorithms md5 create long strings. don't use these algorithms, think. self-built algorithm work, too.

my idea:

for "http://www.google.de/" auto-increment id 239472. following steps:

short = ''; if divisible 2, add "a"+the result short if divisible 3, add "b"+the result short ... until have divisors a-z , a-z. 

that repeated until number isn't divisible more. think approach? have better idea?

i continue "convert number string" approach. realize proposed algorithm fails if id prime , greater 52.

theoretical background

you need bijective function f. necessary can find inverse function g('abc') = 123 f(123) = 'abc' function. means:

  • there must no x1, x2 (with x1 ≠ x2) make f(x1) = f(x2),
  • and every y must able find x f(x) = y.

how convert id shortened url

  1. think of alphabet want use. in case that's [a-za-z0-9]. contains 62 letters.
  2. take auto-generated, unique numerical key (the auto-incremented id of mysql table example).

    for example use 12510 (125 base of 10).

  3. now have convert 12510 x62 (base 62).

    12510 = 2×621 + 1×620 = [2,1]

    this requires use of integer division , modulo. pseudo-code example:

    digits = []  while num > 0   remainder = modulo(num, 62)   digits.push(remainder)   num = divide(num, 62)  digits = digits.reverse 

    now map indices 2 , 1 alphabet. how mapping (with array example) like:

    0  → 1  → b ... 25 → z ... 52 → 0 61 → 9 

    with 2 → c , 1 → b receive cb62 shortened url.

    http://shor.ty/cb 

how resolve shortened url initial id

the reverse easier. reverse lookup in alphabet.

  1. e9a62 resolved "4th, 61st, , 0th letter in alphabet".

    e9a62 = [4,61,0] = 4×622 + 61×621 + 0×620 = 1915810

  2. now find database-record where id = 19158 , redirect.

some implementations (provided commenters)


Comments

Popular posts from this blog

javascript - RequestAnimationFrame not working when exiting fullscreen switching space on Safari -

linux - phpmyadmin, neginx error.log - Check group www-data has read access and open_basedir -