What is it
----------
Short version: 
A patch that fixes a glitch in the BIOS of some Intel VGA onboard graphics
adapters (e.g. 4010U, 4340 i3 and 4200U i5, 4600...) that causes NTVDM
startup to be delayed for 5-10 seconds and also causes slow text output
and other INT10h operations (1 char / second) in fullscreen mode.

Long version:
Recently I bought a new mainboard with an Intel 4600 onboard graphics
adapter. When starting the NTVDM on this machine, I encountered a 5-10 second
delay on NTVDM startup. But that wasn't all; things got even worse when trying
to switch to text mode fullscreen. The character output was 1 char/second,
which is obviously an unacceptable speed.
So I searched the Internet and found 2 threads in the product forums of
Microsoft and Intel:

https://social.technet.microsoft.com/Forums/lync/en-US/f4882880-d9eb-4ab0-bfca-e592b27fb86e/ntvdm-slow-startup?forum=w7itproappcompat
https://communities.intel.com/message/234383

Microsoft never responded to the problem at all, and in the Intel forums
someone from Intel just answered that their boards were not tested for 16bit
compatibility and that they therefore don't care about the problem (which I
cannot really believe: if that were true, text output in the BIOS or on the
Linux console would not work properly either).
Some people in the Intel forum reported that they had to order a bunch
of new graphics cards just because of this bug!
The sheer ignorance of both companies regarding this problem really upset me,
so it was my turn to investigate the problem and fix it, like many times
before :(
First, I needed to find out whom to blame for this mess. Intel or Microsoft?
Interestingly, both are to blame in a certain way, but the problem in
NTVDM is, surprisingly, even beneficial for working around the Intel video
BIOS bug.
Sounds a bit confusing? You can read more in the technical details below.
Microsoft's fault is not implementing IN/OUT on 32bit ports in NTVDM. But
even if that had been implemented, Intel's BIOS wouldn't work properly; it
would cause even more problems. Intel on the other hand expects the MMIO
ports to be accessible, which isn't the case in a V86 environment like the
NTVDM. Therefore it runs into a 1 second timeout on every operation, even
though it seems to work fine without using these MMIO ports. So their fault
is to retry the MMIO access for 1 second each time instead of just skipping
over it if it fails for unknown reasons.
The problem can in fact be solved by jumping over the check routine, which
takes just 1 byte to patch in the Video BIOS.
Fortunately the NTVDM mapping of the BIOS address space can be patched in
memory via an NTVDM extension DLL, so you don't need to reflash your VGA
BIOS.

System Requirements
-------------------
Should work on any Windows-PC with affected Intel Onboard Graphics board and
NTVDM.

How to Install
--------------
Go to the bin-Directory, right click on intelvidfix.inf and click install.

How to Uninstall
----------------
Open a command shell as Administrator (right-click on Command Prompt in
Start/Programs/Accessories, Run as Administrator)

Go to bin directory and then:
RUNDLL32.EXE SETUPAPI.DLL,InstallHinfSection DefaultUninstall 132 intelvidfix.inf

Compiling
---------
You need the NTDDK Headers to compile.
The application was compiled using Microsoft Visual C 6.0
Just use nmake or use the VC 6 Project file.

Technical details
-----------------
Now how did I find out what was going on here?
Obviously it was a problem with the Video BIOS, as the real Video BIOS only
gets executed when the DOS application is running in fullscreen mode;
in windowed mode it just BOPs to the VGA emulation of the NTVDM.
So we know that INT 10h is responsible for doing Video IO.
Therefore I traced through a simple call to INT10h for outputting a
character with the excellent DEBUGX.COM utility from the FreeDOS project
(http://www.freedos.org/software/?prog=debug), which in contrast to
Microsoft's classic DEBUG.EXE also supports 32bit code as used in the
Intel VGA BIOS and many newer BIOSes. For a better overview, I cut out the
register dumps in places where they are not interesting:

-rx
386 regs on  
-a
1628:0100 mov ax, 0e20
1628:0103 
-p
DPMI entry hooked, new entry=0F61:02BC
1628:0103 0000              ADD     [BX+SI],AL                       DS:0000=CD
-t =C000:0014 1000
EAX=00000E20 EBX=00000000 ECX=00000000 EDX=00000000 ESP=0000FFFE EBP=00000000
ESI=00000000 EDI=00000000 EIP=000028E8 EFL=00033286 NV UP EI NG NZ NA PE NC
DS=1628 ES=1628 SS=1628 CS=C000 FS=0000 GS=0000
C000:28E8 FB                STI
C000:28E9 FC                CLD

....snip....

C000:28F9 8BEC              MOV     BP,SP
C000:28FB E8F72E            CALL    57F5

C000:57F9 66BE00540400      MOV     ESI,00045400
C000:57FF E88211            CALL    6984

C000:6984 51                PUSH    CX
C000:6985 B502              MOV     CH,02
C000:6987 6652              PUSH    EDX
C000:6989 6650              PUSH    EAX
C000:698B 2E8B161AE3        MOV     DX,CS:[E31A]                   CS:E31A=F000
C000:6990 668BC6            MOV     EAX,ESI
C000:6993 8AC8              MOV     CL,AL
C000:6995 24FC              AND     AL,FC
EAX=00045400 EBX=00000000 ECX=00000200 EDX=0000F000 ESP=0000FFCA EBP=0000FFE0
ESI=00045400 EDI=00000000 EIP=00006997 EFL=00033246 NV UP EI PL ZR NA PE NC
DS=1628 ES=1628 SS=1628 CS=C000 FS=0000 GS=0000
C000:6997 66EF              OUT     DX,EAX	; Output 45400 on Port F000
EAX=00045400 EBX=00000000 ECX=00000200 EDX=0000F004 ESP=0000FFCA EBP=0000FFE0
ESI=00045400 EDI=00000000 EIP=0000699E EFL=00033246 NV UP EI PL ZR NA PE NC
DS=1628 ES=1628 SS=1628 CS=C000 FS=0000 GS=0000
C000:699E 66ED              IN      EAX,DX	; Read from port F004 to EAX
EAX=0004FFFF EBX=00000000 ECX=00000200 EDX=0000F004 ESP=0000FFCA EBP=0000FFE0
ESI=00045400 EDI=00000000 EIP=000069A3 EFL=00033246 NV UP EI PL ZR NA PE NC
DS=1628 ES=1628 SS=1628 CS=C000 FS=0000 GS=0000
C000:69A3 C0E103            SHL     CL,03	; Interesting, only AX is filled
C000:69A6 66D3E8            SHR     EAX,CL	; and not entire EAX! So this
C000:69A9 6692              XCHG    EAX,EDX	; is the MS 32bit IO bug
EAX=0000F004 EBX=00000000 ECX=00000200 EDX=0004FFFF ESP=0000FFCA EBP=0000FFE0
ESI=00045400 EDI=00000000 EIP=000069AB EFL=00033246 NV UP EI PL ZR NA PE NC
DS=1628 ES=1628 SS=1628 CS=C000 FS=0000 GS=0000
C000:69AB 6658              POP     EAX
C000:69AD 8AC2              MOV     AL,DL
C000:69AF 80FD00            CMP     CH,00
C000:69B2 740A              JZ      69BE
C000:69B4 8BC2              MOV     AX,DX
C000:69B6 80FD01            CMP     CH,01
C000:69B9 7403              JZ      69BE
C000:69BB 668BC2            MOV     EAX,EDX
C000:69BE 665A              POP     EDX
C000:69C0 59                POP     CX
C000:69C1 C3                RET

EAX=0004FFFF EBX=00000000 ECX=00000000 EDX=00000000 ESP=0000FFD6 EBP=0000FFE0
ESI=00045400 EDI=00000000 EIP=00005802 EFL=00033202 NV UP EI PL NZ NA PO NC
DS=1628 ES=1628 SS=1628 CS=C000 FS=0000 GS=0000
C000:5802 6683F8FF          CMP     EAX,-01	; Here it is compared with 
C000:5806 665E              POP     ESI		; FFFFFFFF, which fails and
C000:5808 6658              POP     EAX		; wouldn't fail if MS
C000:580A C3                RET			; implemented 32bit IO support

C000:28FE 0F848E00          JZ      2990	; So no jump here, instead it
C000:2902 E81E0B            CALL    3423	; continues....

...snip...

C000:343C E89D38            CALL    6CDC	; Not so important for us
C000:343F B9E803            MOV     CX,03E8	; Init loop counter with 1000!
C000:3442 E83F35            CALL    6984	; we already know function 6984
EAX=0004FFFF EBX=00000000 ECX=000003E8 EDX=80000000 ESP=0000FFD0 EBP=0000FFE0
ESI=00045400 EDI=00000000 EIP=00003445 EFL=00033202 NV UP EI PL NZ NA PO NC
DS=1628 ES=1628 SS=1628 CS=C000 FS=0000 GS=0000
C000:3445 66A900000040      TEST    EAX,40000000; High word remains 0004
C000:344B 750A              JNZ     3457	; So we don't get out of here
C000:344D 51                PUSH    CX		; until loop is over
C000:344E B90100            MOV     CX,0001	; Sleep for 1ms
C000:3451 E8842E            CALL    62D8	; Sleep routine
C000:3454 59                POP     CX
EAX=0004FFFF EBX=00000000 ECX=000003E8 EDX=80000000 ESP=0000FFD0 EBP=0000FFE0
ESI=00045400 EDI=00000000 EIP=00003455 EFL=00033216 NV UP EI PL NZ AC PE NC
DS=1628 ES=1628 SS=1628 CS=C000 FS=0000 GS=0000
C000:3455 E2EB              LOOPW   3442	; And it loops another 999 times
...


So first of all, what is this dubious port at F000?
According to 2nd Generation Intel Core Processor Family Desktop Datasheet, 
Vol. 2 - Section: Device 2 I/O Registers:

F000 is the MMIO Address Register
F004 is the MMIO Data Register
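
The address/data pair works like an index window: the register offset goes
to the address port and the value comes back from the data port. Below is a
minimal C sketch of what routine C000:6984 appears to do, read off the trace
above. The callback types, the function name mmio_read_via_io and the fake
device are hypothetical and only there so the logic can be tried without
hardware. The size selection via CH (byte/word/dword result) is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical port I/O callbacks. On real hardware these would be the
   32-bit OUT/IN instructions seen in the trace. */
typedef void     (*out32_fn)(uint16_t port, uint32_t value);
typedef uint32_t (*in32_fn)(uint16_t port);

/* Read a graphics register through the I/O window:
   write the dword-aligned offset to the address port, read the dword
   from the data port (base + 4), then shift the requested byte lane
   down to bit 0. */
uint32_t mmio_read_via_io(uint16_t io_base, uint32_t reg,
                          out32_fn out32, in32_fn in32)
{
    uint32_t value;
    assert(out32 != 0 && in32 != 0);
    out32(io_base, reg & ~3u);                 /* AND AL,FC / OUT DX,EAX */
    value = in32((uint16_t)(io_base + 4));     /* IN EAX,DX (DX = F004)  */
    return value >> ((reg & 3u) * 8u);         /* SHL CL,03 / SHR EAX,CL */
}

/* Stand-in device (an assumption for illustration only): register
   0x45400 holds 0x11223344, every other register reads as 0. */
static uint32_t fake_index;

static void fake_out32(uint16_t port, uint32_t value)
{
    if (port == 0xF000)
        fake_index = value;
}

static uint32_t fake_in32(uint16_t port)
{
    if (port == 0xF004 && fake_index == 0x45400)
        return 0x11223344u;
    return 0;
}
```

With the stand-in device, reading register 45401 selects dword 45400 and
shifts the result right by 8 bits.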

They are specific to the Intel board and are not available to our NTVDM
emulation, so they cannot be accessed and always return FF on read.
The first thing that comes to mind when seeing the loop of 1000 iterations
is: why does the VGA BIOS assume that these non-standard I/O ports are
there, and why does it loop 1000 times on every read attempt? That doesn't
look very smart and is the cause of the huge delay.
But then we can see that the code above actually checks whether these ports
are there, and if not (i.e. if EAX is FFFFFFFF), they are not used and the
code should jump over them. Theoretically this should be the case here,
except that the IN to the EAX register just fills the lower AX register and
leaves the high word of the register untouched. So this leads us to believe
that this is an NTVDM problem and not a BIOS bug.
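
To make the register arithmetic concrete, here is a small C model of the
two cases, using the values from the trace (EAX=00045400 before the IN).
The function names are made up for this illustration; the model only
mirrors the observed behaviour and is not how NTVDM is implemented
internally.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "IN EAX,DX" on a port that nobody answers.
   full_width = 1: all 32 bits come back as ones, which is what the
                   BIOS check expects for a missing port.
   full_width = 0: only AX is written, as observed under NTVDM; the
                   high word keeps whatever EAX held before the IN. */
uint32_t in_eax_absent_port(uint32_t eax_before, int full_width)
{
    assert(full_width == 0 || full_width == 1);
    if (full_width)
        return 0xFFFFFFFFu;
    return (eax_before & 0xFFFF0000u) | 0x0000FFFFu;
}

/* The check at C000:5802: CMP EAX,-01, followed by JZ 2990. */
int bios_sees_ports_absent(uint32_t eax)
{
    return eax == 0xFFFFFFFFu;
}

/* The poll at C000:3445: TEST EAX,40000000, followed by JNZ 3457. */
int poll_bit_set(uint32_t eax)
{
    return (eax & 0x40000000u) != 0;
}
```

With the 16bit-only IN the result is 0004FFFF, so the "ports absent" check
fails, and bit 30 is clear, so the poll never succeeds either.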
When digging through the NTVDM, it can be seen that they just didn't implement
IN/OUT for 32bit registers, only for 8 and 16 bit. I even made a patch for
NTVDM to support 32bit IN/OUT (yes, it's possible to patch it in order to
do so; if you are interested in the patch, just contact me), but then I
realized that this made things even worse. Why?
Because of the destination of the jump that is taken if the check for
FFFFFFFF succeeds:

C000:28FE 0F848E00          JZ      2990

This jumps to the END of the interrupt routine, so in this case it doesn't
do anything at all!
From testing the application in fullscreen operation, I was able to see that
it currently does its operations very slowly, but it does them fine. So the
correct way to fix this problem is not to teach NTVDM how to handle 32bit
IN/OUT port access, but instead to just JMP out of the loop with the 1000
iterations and tadaa - the startup delay of NTVDM is gone and fullscreen
operations work normally!
So we just change:

C000:344B 750A              JNZ     3457	

to

C000:344B EB0A              JMP     3457	
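
Both instructions are 2-byte short jumps with the same displacement, so
only the opcode byte changes. The following C sketch checks the jump target
arithmetic and models how many 1ms sleeps the loop at C000:3442 executes
before and after the patch. The function names are hypothetical, and the
loop model assumes that every register read returns the same value, as it
does under NTVDM.

```c
#include <assert.h>
#include <stdint.h>

/* Target of a 2-byte short jump (opcode + rel8) located at `offset`.
   The displacement is relative to the address of the next instruction. */
uint16_t short_jump_target(uint16_t offset, uint8_t rel8)
{
    return (uint16_t)(offset + 2 + (int8_t)rel8);
}

/* Count the 1ms sleeps executed by the poll loop at C000:3442..3455.
   `eax` is the value returned by each register read, `opcode` is the
   byte at C000:344B (0x75 = JNZ, 0xEB = JMP). */
unsigned poll_sleeps(uint32_t eax, uint8_t opcode)
{
    unsigned cx = 1000;                 /* MOV CX,03E8 */
    unsigned sleeps = 0;
    assert(opcode == 0x75 || opcode == 0xEB);
    do {
        int bit_set = (eax & 0x40000000u) != 0;   /* TEST EAX,40000000 */
        if (opcode == 0xEB || bit_set)
            break;                      /* continue at 3457 */
        sleeps++;                       /* CALL 62D8 with CX=0001 */
    } while (--cx != 0);                /* LOOPW 3442 */
    return sleeps;
}
```

With EAX=0004FFFF the unpatched loop sleeps 1000 times, which is the 1
second delay per operation; with EB at 344B it does not sleep at all.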

In order to accomplish this, a little DLL is built that is loaded as a
VDD (Virtual Device Driver) for NTVDM. The memory for the VGA BIOS is mapped
at address C0000 in the process address space and is mapped R/W, so the DLL
can access the memory there and just change 75 to EB, and done.
Since I don't know whether the address layout of all the Intel VGA BIOSes is
the same as mine, I also implemented a pattern matching scan that searches
the VGA BIOS area for the asm code combination at this place and tries to
patch it if the pattern is found. I don't know every Intel VGA BIOS, I just
had the one from my card, but for this one it worked fine.
If it doesn't fix it for your card, just send me an e-mail and I will give
you instructions on how to dump your VGA BIOS so that I can analyze it and
update my patcher to also work with your VGA BIOS version.
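
A pattern scan of this kind could look roughly like the C sketch below. It
is an illustration only: the byte pattern is taken from the trace above
(C000:3445 to C000:3451), and the names poll_pattern and patch_poll_loop
are made up here. The pattern and scan range used by the actual DLL may
differ, and how the DLL obtains the pointer to the BIOS mapping is not
shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte sequence from the trace:
   TEST EAX,40000000 / JNZ +0A / PUSH CX / MOV CX,0001 / CALL rel16.
   The CALL displacement is left out because it depends on code layout. */
static const unsigned char poll_pattern[] = {
    0x66, 0xA9, 0x00, 0x00, 0x00, 0x40,   /* TEST EAX,40000000 */
    0x75, 0x0A,                           /* JNZ  +0A          */
    0x51,                                 /* PUSH CX           */
    0xB9, 0x01, 0x00,                     /* MOV  CX,0001      */
    0xE8                                  /* CALL ...          */
};
#define JNZ_OFFSET_IN_PATTERN 6

/* Scan `bios` for the pattern and turn each matching JNZ (75) into
   JMP (EB). Returns the number of locations patched. */
size_t patch_poll_loop(unsigned char *bios, size_t len)
{
    size_t i, patched = 0;
    assert(bios != NULL);
    for (i = 0; i + sizeof poll_pattern <= len; i++) {
        if (memcmp(bios + i, poll_pattern, sizeof poll_pattern) == 0) {
            bios[i + JNZ_OFFSET_IN_PATTERN] = 0xEB;
            patched++;
        }
    }
    return patched;
}
```

Running the scan a second time finds nothing, because the patched byte no
longer matches the pattern, so applying it repeatedly is harmless.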

For further information, just contact me.

Author
------
Dipl.-Ing. (FH) Ludwig Ertl
mailto: ludwig.ertl@dose.0wnz.at

You can get my other useful DOS-Utilities here:
http://www.waldbauer.com/tmp/reference.php

License
-------
This application and its source code may be distributed freely.
The source code is licensed under the terms of the GNU GPL v3.
See: http://www.gnu.org/licenses/gpl.html

Thanks
------
Thanks to all the users who are still using DOS Applications ;)
