Dalvik Bytecode Acceleration Using Fetch/Decode Hardware Extension (Preprint)

  • Surachai Thongkaew
    Department of Communications and Computer Engineering, Tokyo Institute of Technology
  • Tsuyoshi Isshiki
    Department of Communications and Computer Engineering, Tokyo Institute of Technology
  • Dongju Li
    Department of Communications and Computer Engineering, Tokyo Institute of Technology
  • Hiroaki Kunieda
    Department of Communications and Computer Engineering, Tokyo Institute of Technology

Abstract

The Dalvik virtual machine (Dalvik VM) is the essential software component that runs applications on the Android operating system. Android applications are commonly written in Java and compiled to Java bytecode. The Java bytecode is then converted to Dalvik bytecode (a Dalvik Executable file), which the Dalvik VM interprets on typical Android devices. The main disadvantage of interpretation is that programs run much more slowly than machine code executed directly on the host CPU. Several techniques exist to improve the performance of the Dalvik VM. A typical approach is just-in-time compilation, which converts frequently executed sequences of interpreted instructions into host machine code. Other approaches include dedicated bytecode processors and architectural extensions to existing processors. In this paper, we propose an alternative approach, the "Fetch & Decode Hardware Extension," to improve the performance of the Dalvik VM. The Fetch & Decode Hardware Extension is a specially designed hardware component that fetches and decodes Dalvik bytecode directly, while the core computations within the virtual registers are performed by the optimized Dalvik bytecode software handler. Experimental results show speedups of up to 2.4x on arithmetic instructions, 2.7x on loop and conditional instructions, and 1.8x on method invocation and return instructions. The proposed hardware extension occupies approximately 0.03 mm² (equivalent to 10.56 Kgates) and consumes only 0.23 mW of additional power. These results were obtained from logic synthesis using TSMC 90 nm technology at a 200 MHz clock frequency.

This is a preprint of an article intended for publication in the Journal of Information Processing (JIP). This preprint should not be cited. This article should be cited as: Journal of Information Processing Vol.23 (2015) No.2 (online).
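To make the division of labor in the abstract concrete, the following is a minimal sketch of a purely software interpreter loop. It is not the authors' implementation: the function names, the three-instruction subset, and the register count are assumptions chosen for illustration, and the instruction encodings are intended to mirror the 16-bit code-unit layout of the public Dalvik bytecode format. The fetch and decode steps marked in the comments are the kind of per-instruction overhead the proposed hardware extension is meant to take over, while the handler bodies correspond to the work that remains in software.

```python
# Toy interpreter loop (illustrative only, not the paper's design).
# Assumed subset of opcodes, packed into 16-bit code units as B|A|op.
OP_RETURN = 0x0F         # return vAA            (register index in bits 8-15)
OP_CONST_4 = 0x12        # const/4 vA, #+B       (A in bits 8-11, B in bits 12-15)
OP_ADD_INT_2ADDR = 0xB0  # add-int/2addr vA, vB  (vA = vA + vB)


def sign_extend_4(value):
    """Interpret a 4-bit field as a signed integer in the range -8..7."""
    return value - 16 if value & 0x8 else value


def interpret(code_units, num_registers=16):
    """Run a list of 16-bit code units and return the value of `return vAA`."""
    regs = [0] * num_registers  # virtual registers
    pc = 0
    while True:
        # Fetch: read the next 16-bit code unit.
        unit = code_units[pc]
        pc += 1
        # Decode: split the unit into opcode and operand fields.
        opcode = unit & 0xFF
        a = (unit >> 8) & 0xF
        b = (unit >> 12) & 0xF
        # Dispatch to the software handler for this opcode.
        if opcode == OP_CONST_4:
            regs[a] = sign_extend_4(b)
        elif opcode == OP_ADD_INT_2ADDR:
            # 32-bit wraparound is omitted to keep the sketch short.
            regs[a] = regs[a] + regs[b]
        elif opcode == OP_RETURN:
            return regs[(unit >> 8) & 0xFF]
        else:
            raise ValueError(f"unsupported opcode 0x{opcode:02x}")


# const/4 v0, 3; const/4 v1, 4; add-int/2addr v0, v1; return v0
program = [0x3012, 0x4112, 0x10B0, 0x000F]
print(interpret(program))  # prints 7
```

In this sketch every instruction pays for the masking, shifting, and branching before its handler runs, which is why moving fetch and decode out of software can speed up short handlers such as arithmetic and branch instructions.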

Details

  • CRID
    1572543027753790592
  • NII Article ID
    110009877386
  • NII Book ID
    AN00116647
  • ISSN
    03875806
  • Text language code
    en
  • Data source type
    • CiNii Articles
