xml - XML Content De-duplication


Question 1: I am working on a project that translates English content into 17 other languages. To reduce translation cost, we compute an MD5 hash code for each topic and, based on the result, decide whether the topic is new (master) or was translated earlier (obsolete). The logic has become complicated, and I want to reduce its complexity. We use FileNet as the content management system, which is rather old. :) I need the best suggestion for content de-duplication apart from MD5 hashing.

Note: a topic here means an XML file with images, rendered via XSLT. It does not follow the DITA standard.

Question 2: What is the best alternative for rendering a non-standard (non-DITA) XML file in a UI as HTML or PDF?

Thanks in advance. Waiting for your best suggestions.

Question 1

I recommend not relying on hashes or timestamps, although this depends on your environment. If you refactor variables, change indentation, add or remove comments, and so on, the content does not change, so this should not trigger the translation process. Instead, you may rely on metadata to trigger a semi-automatic process. Further on, use a diffing mechanism to compare the current version of a document with an earlier one.
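One way to keep indentation or comment changes from triggering translation is to compare normalized XML rather than raw bytes. The following is a minimal sketch in Python, assuming the topics are well-formed XML and that Python 3.8+ is available; the function names `normalize`, `needs_translation`, and `content_diff` are hypothetical helpers, not part of FileNet or any translation tool.

```python
import difflib
from xml.etree import ElementTree as ET


def normalize(xml_text: str) -> str:
    """Return a canonical (C14N 2.0) form of the XML.

    Comments are dropped by default, and strip_text=True removes
    whitespace-only text such as indentation. Note that it also trims
    leading and trailing whitespace inside text nodes.
    """
    return ET.canonicalize(xml_text, strip_text=True)


def needs_translation(old_xml: str, new_xml: str) -> bool:
    """True only if the normalized content differs between versions."""
    return normalize(old_xml) != normalize(new_xml)


def content_diff(old_xml: str, new_xml: str) -> list:
    """Return a unified diff of the normalized versions, one tag per line."""
    old_lines = normalize(old_xml).replace("><", ">\n<").splitlines()
    new_lines = normalize(new_xml).replace("><", ">\n<").splitlines()
    return list(difflib.unified_diff(old_lines, new_lines, lineterm=""))


original = "<topic>\n  <!-- reviewer note -->\n  <title>Hello</title>\n</topic>"
reformatted = "<topic><title>Hello</title></topic>"
changed = "<topic><title>Hello world</title></topic>"
```

With these samples, `needs_translation(original, reformatted)` is false because only layout and a comment differ, while `needs_translation(original, changed)` is true. The diff output could then be shown to a reviewer in the semi-automatic step.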

Question 2

As with the first question, this one is hard to answer without knowing your environment. It would be smarter to first convert the files to DITA or Markdown, and then use DITA-OT or a Markdown processor for further transformation.
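As a rough illustration of the conversion step, here is a minimal Python sketch that turns a simple topic into Markdown. The element and attribute names (`topic`, `title`, `para`, `image`, `href`, `alt`) are assumptions made up for this example, since the actual schema is not given in the question; a real converter would map your own elements instead.

```python
from xml.etree import ElementTree as ET


def topic_to_markdown(xml_text: str) -> str:
    """Convert a flat, hypothetical topic structure to Markdown."""
    root = ET.fromstring(xml_text)
    blocks = []
    for elem in root:
        text = (elem.text or "").strip()
        if elem.tag == "title":
            blocks.append(f"# {text}")
        elif elem.tag == "para":
            blocks.append(text)
        elif elem.tag == "image":
            # href and alt are assumed attribute names
            blocks.append(f"![{elem.get('alt', '')}]({elem.get('href', '')})")
        # Unknown elements are skipped in this sketch
    return "\n\n".join(blocks) + "\n"


sample = (
    "<topic>"
    "<title>Install</title>"
    "<para>Run the installer.</para>"
    "<image href='img/setup.png' alt='Setup'/>"
    "</topic>"
)
markdown = topic_to_markdown(sample)
```

The resulting Markdown can then be passed to any Markdown processor to produce HTML or PDF. Nested or mixed content would need a recursive walk rather than this single loop.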

