Given a UUID (v4) without dashes, how can I shorten it to a string of 15 characters or fewer? I should also be able to get back to the original UUID from the shortened string.

I am trying to shorten it in order to send it in a flat file, and the file format specifies this field as a 15-character alphanumeric field. Given the shortened UUID, I should be able to map it back to the original UUID.

Here is what I tried, but definitely not what I wanted.

export function shortenUUID(uuidToShorten: string, length: number) {
  const uuidWithoutDashes = uuidToShorten.replace(/-/g , '');
  const radix = uuidWithoutDashes.length;
  const randomId = [];

  for (let i = 0; i < length; i++) {
    randomId[i] = uuidWithoutDashes[ 0 | Math.random() * radix];
  }
  return randomId.join('');
}

Asked Dec 20, 2017 at 21:21 by RKG; edited Dec 20, 2017 at 22:02 by msanford.
  • 3 What have you already tried? – Liam Commented Dec 20, 2017 at 21:24
  • Sorry, my bad. I should have added what I was doing. That was totally useless and hence didn't add that. – RKG Commented Dec 20, 2017 at 21:34
  • I really don't understand what you are trying to do. Your algorithm is non-deterministic (i.e., Math.random()), so every time you run shortenUUID() on the same input UUID you would generate a different shortened version. How do you expect to do a reverse mapping that way? What is the end goal of making a shorter UUID? There may be a completely different solution to your problem. – msanford Commented Dec 20, 2017 at 21:49
  • 1 Exactly my point and that was why I mentioned, what I did was useless. – RKG Commented Dec 20, 2017 at 21:52
  • 1 @msanford I am trying to shorten it to send it in a flat file and the file format specifies this field to be a 15 characters alphanumeric field. Given, that shortened uuid, I should be able to map it back to the original uuid. – RKG Commented Dec 20, 2017 at 21:57

1 Answer

As AuxTaco pointed out, if you actually mean "alphanumeric" in the sense that it matches /^[A-Za-z0-9]{0,15}$/ (a character set of 26 + 26 + 10 = 62 symbols), then it really is impossible. You can't fit 3 gallons of water in a 1-gallon bucket without losing something. A UUID is 128 bits, so to represent it with a 62-symbol character set you'd need at least 22 characters (log base 62 of 2^128 is about 21.5, rounded up).
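That lower bound can be checked numerically. The sketch below is only an illustration: the helper name minLength is made up here, and BigInt support is assumed. It counts how many symbols of a given character set are needed before the number of possible strings reaches 2^bits.

```javascript
// Hypothetical helper: smallest string length whose capacity covers 2^bits values.
const minLength = (bits, radix) => {
  const target = 2n ** BigInt(bits);
  let capacity = 1n; // number of distinct strings of the current length
  let length = 0;
  while (capacity < target) {
    capacity *= BigInt(radix);
    length++;
  }
  return length;
};

console.log(minLength(128, 62)); // alphanumeric: A-Z, a-z, 0-9
console.log(minLength(128, 16)); // plain hexadecimal
```

For a 62-symbol character set this gives 22, which is why 15 strictly alphanumeric characters cannot hold a full UUID.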

If you are more flexible about your charset and just need 15 Unicode characters that you can put in a text document, then my answer will help.


Note: When I wrote the first part of this answer, I thought the question said a length of 16, not 15. The simpler answer won't work for 15. The more complex version below still will.


In order to do so, you'd need to use some kind of two-way compression algorithm (similar to the algorithms used for zipping files).

However, the problem with trying to compress something like a UUID is that you'd probably have lots of collisions.

A UUID v4 is 32 characters long (without dashes). It's hexadecimal, so its character space is 16 characters (0123456789ABCDEF).

That gives you 16^32 possible combinations, approximately 3.4028237e+38 or 340,282,370,000,000,000,000,000,000,000,000,000,000. To make it recoverable after compression, you'd have to make sure you don't have any collisions (i.e., no 2 UUIDs turn into the same value). That's a lot of possible values (which is exactly why we use that many for UUIDs: the chance of 2 random UUIDs being identical is only about 1 in that big number).

To crunch that many possibilities into 16 characters, you'd have to have at least as many possible values. With 16 characters, you'd need a charset of 256 characters (the 16th root of that big number: 256^16 == 16^32). That's assuming you have an algorithm that never creates a collision.
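The same reasoning can be turned around to ask how large the charset must be for a given output length. This is a rough sketch under the assumption that BigInt is available; minRadix is a hypothetical name, not something from the code in this answer.

```javascript
// Hypothetical helper: smallest radix r such that r^length >= 2^bits.
const minRadix = (bits, length) => {
  const target = 2n ** BigInt(bits);
  let radix = 2n;
  while (radix ** BigInt(length) < target) {
    radix++;
  }
  return Number(radix);
};

console.log(minRadix(128, 16)); // 256
console.log(minRadix(128, 15)); // 371
```

So 16 characters need 256 distinct symbols, while squeezing the same 128 bits into 15 characters needs at least 371.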

One way to ensure you never have collisions would be to convert it from a base-16 number to a base-256 number. That would give you a 1-to-1 relation, ensuring no collisions and making it perfectly reversible. Normally, switching bases is easy in JavaScript: parseInt(someStr, radix).toString(otherRadix) (e.g., parseInt('00FF', 16).toString(20)). Unfortunately, JavaScript only supports radixes up to 36, so we'll have to do the conversion ourselves.

The catch with such a large base is representing it. You could arbitrarily pick 256 different characters, throw them in a string, and use that for a manual conversion. However, I don't think there are 256 different symbols on a standard US keyboard, even if you treat upper and lowercase as different glyphs.

A simpler solution would be to just use arbitrary character codes from 0 to 255 with String.fromCharCode().

Another small catch is that if we tried to treat it all as one big number, we'd have issues, because a JavaScript number can't represent a value that large exactly.

Instead of that, since we already have hexadecimal, we can just split it into pairs of hex digits, convert those, then spit them out. 32 hexadecimal digits = 16 pairs, so that'll (coincidentally) be perfect. (If you had to solve this for an arbitrary size, you'd have to do some extra math and converting to split the number into pieces, convert, then reassemble.)

const uuid = '1234567890ABCDEF1234567890ABCDEF';
const letters = uuid.match(/.{2}/g).map(pair => String.fromCharCode(parseInt(pair, 16)));
const str = letters.join('');
console.log(str);

Note that there are some odd-looking characters in there, because not every char code maps to a "normal" symbol. If the system you are sending this to can't handle them, you'll instead need to go with the array approach: find 256 characters it can handle, make an array of them, and instead of String.fromCharCode(num), use charset[num].

To convert it back, you would just do the reverse: get the char code, convert it to hex, and join the results together:

const uuid = '1234567890ABCDEF1234567890ABCDEF';

// each pair of hex digits (one byte) becomes one character
const compress = uuid => 
  uuid.match(/.{2}/g).map(pair => String.fromCharCode(parseInt(pair, 16))).join('');
  
// each character becomes two hex digits again, zero-padded
const expand = str =>
  str.split('').map(letter => letter.charCodeAt(0).toString(16).padStart(2, '0')).join('');
  
const str = compress(uuid);
const original = expand(str);

console.log(str, original, original.toUpperCase() === uuid.toUpperCase());
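The array approach mentioned earlier (charset[num] instead of String.fromCharCode(num)) could look something like the sketch below. The charset used here, code points U+0100 through U+01FF, is purely an assumption picked because those are all printable Latin letters; whether your target system accepts them is something you'd have to check. The names compressWithCharset and expandWithCharset are hypothetical.

```javascript
// Assumed charset: 256 printable letters, U+0100..U+01FF (swap in whatever your target accepts).
const charset = Array.from({ length: 256 }, (_, i) => String.fromCharCode(0x100 + i));

// Reverse lookup from character back to its index (0..255).
const lookup = new Map(charset.map((ch, i) => [ch, i]));

// Each pair of hex digits (one byte) picks one character from the charset.
const compressWithCharset = uuid =>
  uuid.match(/.{2}/g).map(pair => charset[parseInt(pair, 16)]).join('');

// Each character maps back to its index, written as two hex digits.
const expandWithCharset = str =>
  Array.from(str, ch => lookup.get(ch).toString(16).padStart(2, '0')).join('');

const sample = '1234567890ABCDEF1234567890ABCDEF';
const shortened = compressWithCharset(sample);
console.log(shortened, shortened.length);
console.log(expandWithCharset(shortened).toUpperCase() === sample);
```

Like the String.fromCharCode() version, this produces 16 characters, not 15, since each character still carries exactly one byte.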


For fun, here is how you could do it for any arbitrary input base and output base.

This code is a bit messy because it is really expanded to make it more self-explanatory, but it basically does what I described above.

Since JavaScript doesn't have an infinite level of precision, if you end up converting a really big number (one that prints like 2.00000000e+10), every digit beyond the ones shown before the e was essentially chopped off and replaced with a zero. To account for that, you'll have to break it up in some way.

In the code below, there is a "simple" way which doesn't account for this, so it only works for smaller strings, and then a proper way which breaks it up. I chose a simple, yet somewhat inefficient, approach of just breaking up the string based on how many digits it gets turned into. This isn't the best way (since math doesn't really work like that), but it does the trick (at the cost of needing a larger charset).

You could employ a smarter splitting mechanism if you really needed to keep your charset size to a minimum.

const smallStr = '1234';
const str = '1234567890ABCDEF1234567890ABCDEF';
const hexCharset = '0123456789ABCDEF'; // could also be an array
const compressedLength = 16;
const maxDigits = 16; // this may be a bit browser specific. You can make it smaller to be safer.

const logBaseN = (num, n) => Math.log(num) / Math.log(n);
const nthRoot = (num, n) => Math.pow(num, 1/n);
const digitsInNumber = num => Math.log(num) * Math.LOG10E + 1 | 0;
const partitionString = (str, numPartitions) => {
  const partsSize = Math.ceil(str.length / numPartitions);
  const partitions = [];
  for (let i = 0; i < numPartitions; i++) {
    partitions.push(str.slice(i * partsSize, (i + 1) * partsSize));
  }
  return partitions;
}

console.log('logBaseN test:', logBaseN(256, 16) === 2);
console.log('nthRoot test:', nthRoot(256, 2) === 16);
console.log('partitionString test:', partitionString('ABCDEFG', 3));

// charset.length should equal radix
const toDecimalFromCharset = (str, charset) => 
    str.split('')
      .reverse()
      .map((char, index) => charset.indexOf(char) * Math.pow(charset.length, index))
      .reduce((sum, num) => (sum + num), 0);
      
const fromDecimalToCharset = (dec, charset) => {
  const radix = charset.length;
  let str = '';
  
  for (let i = Math.ceil(logBaseN(dec + 1, radix)) - 1; i >= 0; i--) {
    const part = Math.floor(dec / Math.pow(radix, i));
    dec -= part * Math.pow(radix, i);
    str += charset[part];
  }
  
  return str;
};

console.log('toDecimalFromCharset test 1:', toDecimalFromCharset('01000101', '01') === 69);
console.log('toDecimalFromCharset test 2:', toDecimalFromCharset('FF', hexCharset) === 255);
console.log('fromDecimalToCharset test:', fromDecimalToCharset(255, hexCharset) === 'FF');

const arbitraryCharset = length => new Array(length).fill(1).map((a, i) => String.fromCharCode(i));

// the Math.pow() bit is the possible number of values in the original
const simpleDetermineRadix = (strLength, originalCharsetSize, compressedLength) => nthRoot(Math.pow(originalCharsetSize, strLength), compressedLength);

// the simple ones only work for values that, in decimal, are small enough that lack of precision doesn't mess things up
// compressedCharset.length must be >= compressedLength
const simpleCompress = (str, originalCharset, compressedCharset, compressedLength) =>
  fromDecimalToCharset(toDecimalFromCharset(str, originalCharset), compressedCharset);

const simpleExpand = (compressedStr, originalCharset, compressedCharset) =>
  fromDecimalToCharset(toDecimalFromCharset(compressedStr, compressedCharset), originalCharset);

const simpleNeededRadix = simpleDetermineRadix(str.length, hexCharset.length, compressedLength);
const simpleCompressedCharset = arbitraryCharset(simpleNeededRadix);
const simpleCompressed = simpleCompress(str, hexCharset, simpleCompressedCharset, compressedLength);
const simpleExpanded = simpleExpand(simpleCompressed, hexCharset, simpleCompressedCharset);

// Notice, it gets a little confused because of a lack of precision in the really big number.
console.log('Original string:', str, toDecimalFromCharset(str, hexCharset));
console.log('Simple Compressed:', simpleCompressed, toDecimalFromCharset(simpleCompressed, simpleCompressedCharset));
console.log('Simple Expanded:', simpleExpanded, toDecimalFromCharset(simpleExpanded, hexCharset));
console.log('Simple test:', simpleExpanded === str);

// Notice it works fine for smaller strings and/or charsets
const smallCompressed = simpleCompress(smallStr, hexCharset, simpleCompressedCharset, compressedLength);
const smallExpanded = simpleExpand(smallCompressed, hexCharset, simpleCompressedCharset);
console.log('Small string:', smallStr, toDecimalFromCharset(smallStr, hexCharset));
console.log('Small simple compressed:', smallCompressed, toDecimalFromCharset(smallCompressed, simpleCompressedCharset));
console.log('Small expanded:', smallExpanded, toDecimalFromCharset(smallExpanded, hexCharset));
console.log('Small test:', smallExpanded === smallStr);

// these will break the decimal up into smaller numbers with a max length of maxDigits
// it's a bit browser specific where the lack of precision is, so a smaller maxDigits
//  may make it safer
//
// note: charset may need to be a little bit bigger than what determineRadix decides, since we're
//  breaking the string up
// also note: we're breaking the string into parts based on the number of digits in it as a decimal
//  this will actually make each individual part's decimal length smaller, because of how numbers work,
//  but that's okay. If you have a charset just barely big enough because of other constraints, you'll
//  need to make this even more complicated to make sure it's perfect.
const partitionStringForCompress = (str, originalCharset) => {
    const numDigits = digitsInNumber(toDecimalFromCharset(str, originalCharset));
    const numParts = Math.ceil(numDigits / maxDigits);
    return partitionString(str, numParts);
}

const partitionedPartSize = (str, originalCharset) => {
    const parts = partitionStringForCompress(str, originalCharset);
    return Math.floor((compressedLength - parts.length - 1) / parts.length) + 1;
}

const determineRadix = (str, originalCharset, compressedLength) => {
    const parts = partitionStringForCompress(str, originalCharset);
    return Math.ceil(nthRoot(Math.pow(originalCharset.length, parts[0].length), partitionedPartSize(str, originalCharset)));
}

// caveats: the separator (the last charset character) is also a valid digit, and a part's leading
//  zeros are not restored on expand, so some inputs will not round-trip without extra handling
const compress = (str, originalCharset, compressedCharset, compressedLength) => {
    const parts = partitionStringForCompress(str, originalCharset);
    const partSize = partitionedPartSize(str, originalCharset);
    return parts.map(part => simpleCompress(part, originalCharset, compressedCharset, partSize)).join(compressedCharset[compressedCharset.length-1]);
}

const expand = (compressedStr, originalCharset, compressedCharset) =>
    compressedStr.split(compressedCharset[compressedCharset.length-1])
     .map(part => simpleExpand(part, originalCharset, compressedCharset))
     .join('');

const neededRadix = determineRadix(str, hexCharset, compressedLength);
const compressedCharset = arbitraryCharset(neededRadix);
const compressed = compress(str, hexCharset, compressedCharset, compressedLength);
const expanded = expand(compressed, hexCharset, compressedCharset);

console.log('String:', str, toDecimalFromCharset(str, hexCharset));
console.log('Needed radix size:', neededRadix); // bigger than normal because of how we're breaking it up... this could be improved if needed
console.log('Compressed:', compressed);
console.log('Expanded:', expanded);
console.log('Final test:', expanded === str);


To use the above specifically to answer the question, you would use:

// note: the helpers above read the top-level compressedLength constant, so set that to 15 as well
const hexCharset = '0123456789ABCDEF';
const compressedCharset = arbitraryCharset(determineRadix(uuid, hexCharset));

// UUID to 15 characters
const compressed = compress(uuid, hexCharset, compressedCharset, 15);

// 15 characters to UUID
const expanded = expand(compressed, hexCharset, compressedCharset);

If there are problematic characters in the arbitrary charset, you'll have to do something to either filter those out or hard-code a specific charset. Just make sure all of the functions are deterministic (i.e., same result every time).
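With BigInt (which postdates this answer, so its availability is an assumption about your environment), the whole UUID can be treated as one exact 128-bit integer and written in a larger base directly, with no partitioning. The sketch below is one hypothetical way to do that. The names compressUuid and expandUuid, the radix of 371 (the smallest whose 15th power reaches 2^128), and the choice of code points U+0100 to U+0272 as the charset are illustrative assumptions rather than part of the code above, and the output is not alphanumeric in the [A-Za-z0-9] sense.

```javascript
const RADIX = 371n;   // assumed: smallest radix with RADIX ** 15 >= 2 ** 128
const LENGTH = 15;    // fixed output length, so leading zeros survive
const OFFSET = 0x100; // assumed charset: code points U+0100..U+0272

// UUID (with or without dashes) -> 15 characters
const compressUuid = (uuid) => {
  let value = BigInt('0x' + uuid.replace(/-/g, ''));
  let out = '';
  for (let i = 0; i < LENGTH; i++) {
    // peel off the least significant base-371 digit and prepend its character
    out = String.fromCharCode(OFFSET + Number(value % RADIX)) + out;
    value /= RADIX;
  }
  return out;
};

// 15 characters -> 32 lowercase hex digits
const expandUuid = (short) => {
  let value = 0n;
  for (const ch of short) {
    value = value * RADIX + BigInt(ch.charCodeAt(0) - OFFSET);
  }
  return value.toString(16).padStart(32, '0');
};

const shortId = compressUuid('1234567890abcdef1234567890abcdef');
console.log(shortId, shortId.length);
console.log(expandUuid(shortId));
```

Because the output is always padded to exactly 15 characters, no separator is needed and the mapping stays one-to-one. If the target system rejects some of these characters, the OFFSET arithmetic could be swapped for a lookup into any list of at least 371 characters it does accept.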
